Course Description

This course teaches methods for mining unstructured data, which is central to big data analysis. Because modern data environments combine structured and unstructured data, students first review the basics of structured data analysis and then extend them to unstructured data mining. Topics include Python programming basics, web crawling for data collection, text mining, sentiment analysis, and social media analysis.

Learning Objectives

  • Review and deepen understanding of structured data analysis techniques
  • Understand the characteristics and analytical methodologies of unstructured data
  • Acquire skills in unstructured data collection and preprocessing using Python
  • Learn key unstructured data analysis techniques such as text mining, sentiment analysis, and social media analysis
  • Understand integrated analysis methods for structured and unstructured data
  • Strengthen data analysis capabilities through practice-oriented projects

Evaluation

  • Midterm: 25%
  • Final: 25%
  • Project: 25%
  • Attendance: 15%
  • Participation: 10%

Weekly Schedule

Week | Topic | Details
1 | Course Introduction and Big Data Concepts | Course overview, big data concepts and importance, differences between structured and unstructured data
2 | Python Programming Basics (1) | Python syntax fundamentals, data structures, NumPy basics
3 | Python Programming Basics (2) | Pandas, data visualization with Matplotlib and Seaborn
4 | Advanced Structured Data Analysis | Descriptive statistics, exploratory data analysis, correlation analysis
5 | Introduction to Web Data Collection | Understanding web structure, Requests, BeautifulSoup
6 | Practical Web Data Collection | Public data, Open APIs, web scraping ethics
7 | Text Preprocessing and Analysis Basics | Text preprocessing, KoNLPy usage
8 | Midterm Exam |
9 | Gaining Insights from Text | Word frequency analysis, word clouds, TF-IDF
10 | Understanding Opinions through Sentiment Analysis | Sentiment analysis, review analysis, social media sentiment analysis
11 | Discovering Hidden Topics in Documents | Topic modeling, news article analysis
12 | Understanding Data through Network Relationships | Network analysis, graph theory, social media influence analysis
13 | Analyzing Multiple Data Types Together | Text-numeric integrated analysis, image data basics
14 | Data Storytelling Workshop | Data storytelling, effective visualization
15 | Team Project Presentation and Final Exam | Final presentations and evaluation

Textbook

Instructor-prepared lecture slides (PPT)