A Level: Data Science Using Python (A10.1-R5, NIELIT / DOEACC, Live Classes)
Data science is an interdisciplinary field that uses scientific processes and algorithms to extract knowledge and insights from structured and unstructured data.
Python has attracted considerable interest as a language for data analysis and data science. It is a free, open-source, general-purpose programming language that is easy to learn, and its versatility makes it well suited to implementing the steps of a data science process.
Python is used for web development, data analysis, artificial intelligence, and scientific computing. Three of the most widely used Python libraries for data science are NumPy, Pandas, and Matplotlib. NumPy and Pandas are used for analyzing and exploring data; Matplotlib is a data visualization library for drawing graphs that present the analysis.
As the IT industry grows, demand for skilled data scientists is rising, and Python has become one of the most preferred programming languages for the role. This course focuses on fundamental Python programming techniques, reading and manipulating CSV files, and the main libraries for data science.
After completing the module, the student will be able to:
- Take tabular data and clean it
- Manipulate the data
- Run basic inferential statistical analyses
- Perform data analysis
- Visualize the results of the analysis
- Build a front-end GUI
Duration: 120 hours (Theory: 48 hrs + Practical: 72 hrs)
(i) Python Language, Structures, Programming Constructs
Review of the Python language, data types, variables, assignments, immutable variables, strings, string methods, functions and printing, lists and their operations, programs using tuples and dictionaries, slicing strings, lists, and tuples.
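The topics in this unit can be illustrated with a short sketch. The variable names and sample values below are hypothetical and chosen only for illustration; they are not taken from the official course material.

```python
# Strings are immutable sequences; slicing returns a new string.
course = "Data Science"
first_word = course[:4]           # characters 0..3
reversed_course = course[::-1]    # a step of -1 walks the string backwards

# String methods return new strings rather than modifying in place.
shouted = course.upper()

# Lists are mutable: they can grow and be reordered.
marks = [72, 85, 90, 66]
marks.append(78)
top_three = sorted(marks, reverse=True)[:3]   # sort descending, slice first 3

# Tuples are immutable: item assignment raises TypeError.
point = (3, 4)
try:
    point[0] = 10
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

# Dictionaries map keys to values.
hours = {"theory": 48, "practical": 72}
total_hours = sum(hours.values())


def describe(name, total):
    """Return a formatted summary line."""
    return f"{name}: {total} hours"


print(describe(course, total_hours))
```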
(ii) Data Science and Analytics Concepts
What is Data Science and Analytics? The Data Science Process, Framing the problem, Collecting, Processing, Cleaning and Munging Data, Exploratory Data Analysis, Visualizing results.
(iii) Introduction to NumPy Library
NumPy: array processing package, array types, array slicing, computation on NumPy arrays: universal functions, aggregations (min, max, etc.), N-dimensional arrays, broadcasting, fancy indexing, sorting arrays, loading data into NumPy from various formats.
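As a rough illustration of the NumPy topics listed above, the following sketch touches slicing, universal functions, aggregations, broadcasting, fancy indexing, sorting, and loading text data. The array contents and the in-memory CSV string are made up for the example.

```python
import io

import numpy as np

# A 3x4 array holding 0..11, one row per group of four values.
a = np.arange(12).reshape(3, 4)

# Slicing: all rows, first two columns.
first_two_cols = a[:, :2]

# Universal function (ufunc): applied element by element.
squared = np.square(a)

# Aggregations along an axis.
col_max = a.max(axis=0)   # maximum of each column
row_sum = a.sum(axis=1)   # sum of each row

# Broadcasting: the shape-(4,) vector of column means is
# subtracted from every row of the shape-(3, 4) array.
centered = a - a.mean(axis=0)

# Fancy indexing: pick elements (0, 1) and (2, 3) in one call.
picked = a[[0, 2], [1, 3]]

# Sorting: argsort returns the indices that would sort the array.
order = np.argsort(np.array([30, 10, 20]))

# Loading delimited text; a file path could be passed instead of StringIO.
loaded = np.loadtxt(io.StringIO("1,2\n3,4"), delimiter=",")
```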
(iv) Data Analysis Tool: Pandas
Introduction to the data analysis library Pandas, Pandas objects (Series and DataFrames), data indexing and selection, NaN values, manipulating DataFrames, grouping, filtering, slicing, sorting, ufuncs, combining datasets with merge and join. Querying DataFrame structures for cleaning and processing, lambdas. Aggregation functions and applying user-defined functions for manipulations.
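A minimal sketch of several of these Pandas operations follows. The two small tables (`students` and `fees`) and their column names are hypothetical sample data, not part of the course material.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing mark (NaN).
students = pd.DataFrame({
    "roll": [1, 2, 3, 4],
    "name": ["Asha", "Ravi", "Meena", "John"],
    "city": ["Delhi", "Pune", "Delhi", "Pune"],
    "marks": [82.0, np.nan, 91.0, 67.0],
})
fees = pd.DataFrame({"roll": [1, 2, 3], "fee_paid": [True, False, True]})

# Cleaning: replace the missing mark with the column mean.
cleaned = students.copy()
cleaned["marks"] = cleaned["marks"].fillna(cleaned["marks"].mean())

# Filtering and sorting.
high = cleaned[cleaned["marks"] >= 80].sort_values("marks", ascending=False)

# Grouping and aggregation: mean marks per city.
city_mean = cleaned.groupby("city")["marks"].mean()

# Applying a user-defined function through a lambda.
cleaned["grade"] = cleaned["marks"].apply(lambda m: "A" if m >= 80 else "B")

# Combining datasets: a left merge keeps every student,
# leaving fee_paid as NaN where no fee record exists.
merged = cleaned.merge(fees, on="roll", how="left")

print(merged)
```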
(iv) Statistical Concepts and Functions
The statistics module, manipulating statistical data, calculating results of statistical operations. Probability distributions in Python; functions such as mean, median, mode, and standard deviation. Concepts of correlation and regression.
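The statistical functions named above can be sketched with the standard library alone. The data values are invented for illustration, and `pearson_r` and `fit_line` are hypothetical helper names written out by hand here so the arithmetic behind correlation and simple linear regression is visible.

```python
import math
import statistics

marks = [55, 60, 60, 70, 80, 95]

mean = statistics.mean(marks)
median = statistics.median(marks)
mode = statistics.mode(marks)
stdev = statistics.stdev(marks)   # sample standard deviation (n - 1)

# A normal distribution with assumed mean 70 and standard deviation 10:
# probability that a value is at most 80.
p_below_80 = statistics.NormalDist(mu=70, sigma=10).cdf(80)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Hypothetical study hours and scores lying on a perfect straight line.
study_hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]
r = pearson_r(study_hours, scores)
slope, intercept = fit_line(study_hours, scores)
```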
(v) Data Visualization: Matplotlib
Visualization with Matplotlib, simple line plots, scatter plots, density and contour plots (visualizing functions), multiple subplots, plotting histograms, bar charts, scatter graphs, and line graphs.
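The plot types listed here can be combined in one figure using subplots. This is only a sketch: the data is synthetic, and the `Agg` backend is an assumption made so the example also runs on a machine without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: render off-screen, no window needed

import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for illustration only.
x = np.linspace(0, 2 * np.pi, 100)
rng = np.random.default_rng(0)
samples = rng.normal(loc=70, scale=10, size=200)

# Multiple subplots: a 2x2 grid of axes in a single figure.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))

# Line plot.
axes[0, 0].plot(x, np.sin(x), label="sin(x)")
axes[0, 0].set_title("Line plot")
axes[0, 0].legend()

# Scatter plot of every fifth point.
axes[0, 1].scatter(x[::5], np.cos(x[::5]))
axes[0, 1].set_title("Scatter plot")

# Histogram of the random samples.
axes[1, 0].hist(samples, bins=20)
axes[1, 0].set_title("Histogram")

# Bar chart of the course hours.
axes[1, 1].bar(["Theory", "Practical"], [48, 72])
axes[1, 1].set_title("Bar chart")

fig.tight_layout()
# In an interactive session, plt.show() would display the figure.
```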
(vi) GUI – Tkinter
Tkinter as the built-in Python module for creating GUI applications. Creating widgets such as button, canvas, label, entry, frame, check button, etc. Geometry management: pack, grid, and place; organizing layouts and widgets; binding functions; mouse-click events. Building the complete interface of a project.
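A small Tkinter interface might be put together as in the sketch below. The temperature converter, the function names `celsius_to_fahrenheit` and `build_app`, and the widget layout are all hypothetical choices made for illustration.

```python
def celsius_to_fahrenheit(celsius):
    """Pure conversion logic, kept separate from the GUI code."""
    return celsius * 9 / 5 + 32


def build_app():
    """Create the window and its widgets; return the root window."""
    # Imported here so the helper above can be used without a display.
    import tkinter as tk

    root = tk.Tk()
    root.title("Temperature Converter")

    # A frame groups the widgets; pack places it inside the root window.
    frame = tk.Frame(root, padx=10, pady=10)
    frame.pack()

    # Inside the frame, grid arranges widgets in rows and columns.
    tk.Label(frame, text="Celsius:").grid(row=0, column=0, sticky="w")
    entry = tk.Entry(frame, width=10)
    entry.grid(row=0, column=1)
    result = tk.Label(frame, text="")
    result.grid(row=1, column=0, columnspan=2)

    def on_convert(event=None):
        # Called by the button (no event) or the Return key (with event).
        try:
            value = float(entry.get())
        except ValueError:
            result.config(text="Please enter a number")
            return
        result.config(text=f"{celsius_to_fahrenheit(value):.1f} °F")

    button = tk.Button(frame, text="Convert", command=on_convert)
    button.grid(row=2, column=0, columnspan=2)

    # Binding: pressing Return in the entry triggers the same handler.
    entry.bind("<Return>", on_convert)
    return root
```

On a machine with a display, calling `build_app().mainloop()` would open the window and start the event loop.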
(vii) Machine Learning: The Next Step
What is Machine Learning? Types of machine learning algorithms, training on data, and an introduction to various learning algorithms. Applications of machine learning.