What is a Library?
In school or college, when we have doubts, we go to the library and read books to understand concepts.
- In programming, a library is a collection of ready-made functions and tools.
- Instead of writing everything from scratch, we use libraries.
- Writing every function ourselves would take far too long.
- So developers created libraries to save time.
Important Libraries in Machine Learning
NumPy
- Used for numerical operations
- Works with arrays and numbers
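As a minimal sketch of what "numerical operations on arrays" can look like, the example below builds a small array and computes a few summaries. The values and the variable names (marks, scaled, average, highest) are made up for illustration.

```python
import numpy as np

# Build an array from a plain Python list (illustrative values).
marks = np.array([72, 85, 90, 64])

# Arithmetic applies to every element at once, with no explicit loop.
scaled = marks / 100

# Common numerical summaries are available as array methods.
average = marks.mean()
highest = marks.max()

print(scaled)
print(average, highest)
```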
Pandas
- Used for handling data
- Very useful for:
  - Data loading
  - Data cleaning
  - Data wrangling
  - EDA (Exploratory Data Analysis)
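The cleaning and EDA steps listed above might look roughly like this on a tiny table. The table contents and column names are invented for the example, and filling missing values with the column mean is only one possible cleaning choice.

```python
import pandas as pd

# A tiny made-up table with one missing age.
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meena"],
    "age": [21.0, None, 23.0],
})

# Data cleaning: count missing values, then fill them with the column mean.
missing_before = int(df["age"].isna().sum())
df["age"] = df["age"].fillna(df["age"].mean())

# EDA: summary statistics for the numeric column.
summary = df["age"].describe()

print(missing_before)
print(df)
print(summary)
```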
Matplotlib
- Used for data visualization
- Helps create graphs and charts
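A short sketch of creating a chart is shown below. The sales figures are made up, the output file name sales.png is an arbitrary choice, and the "Agg" backend is selected only so the script also runs on machines without a display (in Jupyter this line is usually not needed).

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; assumption for headless runs
import matplotlib.pyplot as plt

# Illustrative data only.
months = ["Jan", "Feb", "Mar"]
sales = [120, 150, 90]

fig, ax = plt.subplots()
ax.bar(months, sales)            # one bar per month
ax.set_xlabel("Month")
ax.set_ylabel("Sales")
ax.set_title("Monthly sales (made-up data)")

# Save the chart as an image in the system's temporary folder.
output_path = os.path.join(tempfile.gettempdir(), "sales.png")
fig.savefig(output_path)
print(output_path)
```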
Scikit-learn
- Used for Machine Learning algorithms
- Helps:
  - Train models
  - Test models
  - Calculate metrics (MSE, Accuracy, etc.)
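The train, test, and metric steps above can be sketched as follows. The data is synthetic (it follows y = 2x + 1 exactly), and LinearRegression is just one algorithm picked for illustration; the 30% test split and random_state are arbitrary assumptions.

```python
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Made-up data that follows y = 2x + 1 exactly.
X = [[x] for x in range(10)]
y = [2 * x + 1 for x in range(10)]

# Hold out 30% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

model = LinearRegression()
model.fit(X_train, y_train)                    # train the model
predictions = model.predict(X_test)            # test on unseen rows
mse = mean_squared_error(y_test, predictions)  # calculate a metric

print(mse)
```

Because the data is perfectly linear, the error should be close to zero; real datasets will not behave this neatly.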
Keras & TensorFlow
- Used for Deep Learning
- Helpful for Neural Networks
Step 1: Installing the Library
If a library is not installed, Python cannot import it; the import statement fails with ModuleNotFoundError.
Installation Command
!pip install pandas
Explanation
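- ! → Tells Jupyter to run the rest of the line as a system (shell) command
- pip → Python's package installer
- install → The pip subcommand that downloads and installs a package
- pandas → Name of the library to install
- If the library is already installed, pip prints: Requirement already satisfied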
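One way to check from Python whether a library is available before trying to import it is sketched below. The helper name is_installed is hypothetical (it is not part of pip or pandas); it relies on importlib.util.find_spec from the standard library.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name.

    Hypothetical helper written for this example.
    """
    return importlib.util.find_spec(module_name) is not None


print(is_installed("json"))    # json ships with Python itself
print(is_installed("pandas"))  # depends on whether pandas was installed
```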
Step 2: Importing the Library
- After installation, we import the library.
- Basic Import: import pandas
- Import with Alias (Recommended): import pandas as pd
Why Use "as pd"?
- pandas is long to type every time we call one of its functions
- We shorten it to pd, the alias used by convention in most pandas code
Now we can use:
pd.read_csv()
Instead of:
pandas.read_csv()
This saves typing and keeps the code shorter.
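A quick way to confirm that the alias is only a shorter name, and not a separate copy of the library, is the small check below.

```python
import pandas
import pandas as pd

# Both names refer to the same module object.
same_module = pd is pandas

# So the functions reached through either name are the same too.
same_function = pd.read_csv is pandas.read_csv

print(same_module, same_function)
```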
Loading Dataset Using Pandas
To perform data wrangling, we need data.
Method 1: Load CSV File
- Syntax: data = pd.read_csv("file_path")
- Example: data = pd.read_csv("C:\\Users\\Pradeep\\Desktop\\data.csv")
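- In Windows: Write each backslash as \\ because a single \ starts an escape sequence in a Python string. A raw string such as r"C:\Users\Pradeep\Desktop\data.csv" or forward slashes also work.
- In Linux: Paths use a single forward slash /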
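The sketch below loads a CSV end to end. So that it runs anywhere, it first writes a small made-up file to a temporary folder; the file contents and column names are assumptions. It also compares the escaped and raw-string spellings of the Windows path from the example above (the path is only compared as text, not opened).

```python
import os
import tempfile

import pandas as pd

# Write a small CSV to a temporary folder so the example is self-contained.
folder = tempfile.mkdtemp()
file_path = os.path.join(folder, "data.csv")
with open(file_path, "w") as f:
    f.write("name,marks\nAsha,72\nRavi,85\n")

# Load the file into a DataFrame.
data = pd.read_csv(file_path)
print(data.shape)
print(list(data.columns))

# Two spellings of the same Windows path (illustrative path only).
escaped = "C:\\Users\\Pradeep\\Desktop\\data.csv"
raw = r"C:\Users\Pradeep\Desktop\data.csv"
print(escaped == raw)
```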
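To Display Data
In a Jupyter cell, writing the variable name on the last line displays the table:
data
In a regular Python script, use print(data) instead.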
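For large files it is often more practical to look at only a few rows. A minimal sketch, using a small made-up table in place of a loaded file:

```python
import pandas as pd

# Stand-in for a loaded dataset (illustrative values).
data = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meena"],
    "marks": [72, 85, 90],
})

first_rows = data.head(2)  # first 2 rows only
print(first_rows)
print(data.shape)          # (number of rows, number of columns)
```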
