15/04/2026

What is Software Engineering

Meaning of Software Engineering

Software Engineering is made of two words:

  • (A) Software
  • (B) Engineering

(A) Software

Software means a collection of programs, data, and instructions used to perform tasks.

Examples:
  • WhatsApp
  • MS Word
  • Banking App
  • Games
  • ATM Software

(B) Engineering

Engineering means applying scientific knowledge, methods, tools, and techniques to solve problems.

Final Meaning

Software Engineering is the branch of engineering that deals with designing, developing, testing, maintaining, and improving software using proper methods.

Definition (Important for Exam)

IEEE Definition (quoted in Roger Pressman's textbook):

Software Engineering is the application of a systematic, disciplined, quantifiable approach to the development, operation, and maintenance of software.

2. Why Software Engineering is Needed?

Earlier, software was often developed without systematic planning, which caused:

  • Late delivery
  • High cost
  • Errors/Bugs
  • Poor quality
  • Difficult maintenance
So software engineering aims to provide:
  • Proper planning
  • Better quality
  • Lower cost
  • Faster development
  • Easier maintenance

3. Characteristics of Good Software

Good software should have these qualities:

  • Correctness
  • Reliability
  • Efficiency
  • Usability
  • Maintainability
  • Portability
  • Security
  • Scalability
  • Reusability

4. Nature of Software

Software is different from hardware.

1. Software is Developed, not Manufactured
Hardware is manufactured in factories.
Software is created by coding.

2. Software Does Not Wear Out
Hardware wears out with age.
Software does not physically degrade, although it can become outdated.

3. Software is Custom Built
Much software is developed to meet specific customer needs.

4. Software is Intangible
Software cannot be physically touched.

5. Software is Complex
Large software contains many modules.

4. Software Process

Meaning:

A software process is a set of activities used to produce a good software product.

Handled by:

  • Software Engineer
  • Developer

5. Main Parts of Software Process

1. Software Specification
2. Software Development
3. Software Validation
4. Software Evolution

1. Software Specification

Means planning before development.

Includes:

  • Customer requirements
  • User needs
  • Features list
  • Functionalities
  • Programming language selection
  • Database selection

This is the analysis stage.

2. Software Development

Actual creation of software.

Includes:

  • Designing the software
  • Coding in the chosen programming language

Software is developed according to the customer requirements.

3. Software Validation

Means testing the software after development.

Includes:

  • Checking whether the software works properly
  • Checking whether the customer requirements are met
  • Finding errors
  • Removing bugs

Done by testers.
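
As a minimal sketch, the validation checks above can be automated as small tests. The function `calculate_interest` and its requirement (simple interest, negative inputs rejected) are hypothetical and chosen only for illustration.

```python
# Hypothetical requirement: simple interest = principal * rate * years / 100,
# and negative inputs must be rejected.
def calculate_interest(principal, rate_percent, years):
    if principal < 0 or rate_percent < 0 or years < 0:
        raise ValueError("inputs must be non-negative")
    return principal * rate_percent * years / 100


def run_validation_checks():
    """Return a list of (check name, passed) pairs."""
    results = []

    # Check 1: the software works properly for a normal case.
    results.append(("normal case", calculate_interest(1000, 5, 2) == 100))

    # Check 2: a customer requirement (reject negative amounts) is met.
    try:
        calculate_interest(-1, 5, 2)
        results.append(("rejects negative input", False))
    except ValueError:
        results.append(("rejects negative input", True))

    return results


for name, passed in run_validation_checks():
    print(name, "PASS" if passed else "FAIL")
```

A failed check points the tester to an error that must be fixed before delivery.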

4. Software Evolution

Means changing software after delivery.

When the client's requirements change, the software must change too.

Includes:

  • New features
  • Flexibility
  • Scalability
  • Updates

6. SDLC Models

Meaning:

SDLC = Software Development Life Cycle

These models are used by developers to build software.

7. Types of SDLC Models Mentioned

Model Name
Waterfall Model
RAD Model
Spiral Model
V-Model
Incremental Model
Agile Model
Iterative Model
Big Bang Model

SARIMAX

Complete SARIMAX Study Notes

1. What is SARIMAX?

SARIMAX is a forecasting model used to predict future values from past time-based data.

Full Form:

  • S = Seasonal
  • AR = AutoRegressive
  • I = Integrated
  • MA = Moving Average
  • X = Exogenous Variables

It is an extension of ARIMA that adds seasonal terms and exogenous (external) variables.

Example:
Predict next month ice cream sales using:
  • Past sales
  • Summer season
  • Temperature

2. Why We Use SARIMAX?

SARIMAX handles all of the following in one model:
  • ✅ Trend
  • ✅ Seasonality
  • ✅ Past dependency
  • ✅ External factors
Example:
Electricity demand depends on:
  • Past usage
  • Weather
  • Summer season

3. Simple Meaning of SARIMAX

SARIMAX predicts future data using:

  • Past values
  • Past forecast errors (mistakes)
  • Seasonal patterns
  • External factors
Example:
Predict website traffic using:
  • Last week traffic
  • Weekend effect
  • Ad campaign

4. Components of SARIMAX

(1) Seasonal Component

Captures repeated patterns.

Every December sales increase.

(2) AutoRegressive (AR)

Current value depends on past values.

Today's stock price depends on yesterday's price.

(3) Integrated (I)

Used to remove trend using differencing.

Sales: 100, 120, 140, 160 → Trend exists.

(4) Moving Average (MA)

Uses past errors.

If yesterday's forecast error was +10, today's forecast is adjusted to account for it.

(5) Exogenous Variables (X)

External variables affect output.

Rainfall affects umbrella sales.

5. What is Seasonality?

Seasonality means a pattern that repeats after a fixed interval.

  • Ice cream sales high every summer
  • Shopping high every Diwali

6. Why Seasonality is Important?

Ignoring seasonality gives a wrong forecast.

If AC sales are predicted to be the same in winter and summer, the result is wrong.

7. How to Handle Seasonality?

Use SARIMAX with seasonal parameters.

Monthly sales repeating yearly → use m = 12

8. Differencing (Integration)

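Used to remove trend by subtracting the previous value from the current value.

Y'(t) = Y(t) - Y(t-1)

The parameter d is the number of times this differencing is applied.

Today sales = 200
Yesterday sales = 180

Difference = 200 - 180 = 20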
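
A minimal sketch of differencing in plain Python, using the trending sales series 100, 120, 140, 160 from the Integrated (I) component. The helper name `difference` is an assumption made for illustration.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


sales = [100, 120, 140, 160]           # upward trend
first_diff = difference(sales)          # differencing applied once (d = 1)
second_diff = difference(first_diff)    # differencing applied twice (d = 2)

print(first_diff)   # [20, 20, 20]
print(second_diff)  # [0, 0]
```

After one round of differencing the values are constant, so the trend is gone and d = 1 is enough for this series.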

9. What is Stationary Data?

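A series is stationary when its mean and variance stay stable over time.

Example: 100, 102, 98, 101, 99 is stationary because the values fluctuate around a constant level (about 100).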
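
A rough sketch of how stability can be checked by hand: split the series into two halves and compare their mean and variance. This is only a quick heuristic, not a formal test (a statistical test such as Augmented Dickey-Fuller is normally used in practice). The extra values appended to both series are made up for illustration.

```python
from statistics import mean, pvariance


def half_stats(series):
    """Return (mean, variance) of the first half and of the second half."""
    mid = len(series) // 2
    first, second = series[:mid], series[mid:]
    return (mean(first), pvariance(first)), (mean(second), pvariance(second))


stable = [100, 102, 98, 101, 99, 100, 101, 99]        # fluctuates around 100
trending = [100, 120, 140, 160, 180, 200, 220, 240]   # keeps rising

for name, series in [("stable", stable), ("trending", trending)]:
    (m1, v1), (m2, v2) = half_stats(series)
    print(name, "mean:", m1, "->", m2, "| variance:", v1, "->", v2)
```

For the stable series the two means are almost equal, while for the trending series the mean shifts by 80, which signals non-stationarity.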

10. Seasonal Differencing

Used to remove the seasonal effect by subtracting the value from one full season earlier.

Y'(t) = Y(t) - Y(t-m)

This January sales = 500
Last January sales = 450

Difference = 500 - 450 = 50
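
A minimal sketch of seasonal differencing with m = 12. Only the two January values (450 and 500) come from the example above; all other monthly values are hypothetical.

```python
def seasonal_difference(series, m):
    """Subtract the value from one season (m steps) earlier."""
    return [series[t] - series[t - m] for t in range(m, len(series))]


# Two years of monthly sales, January first.
last_year = [450, 430, 440, 460, 480, 520, 560, 550, 500, 470, 460, 490]
this_year = [500, 480, 490, 510, 530, 570, 610, 600, 550, 520, 510, 540]
sales = last_year + this_year

seasonal_diff = seasonal_difference(sales, m=12)
print(seasonal_diff[0])  # 50  (this January minus last January)
print(seasonal_diff)
```

Every month is compared with the same month one year earlier, so the repeating yearly shape cancels out and only the year-over-year change remains.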

11. Identify Seasonal Component

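Look for a pattern that repeats at a fixed interval.

Example: restaurant sales are high every Sunday, so daily data has a weekly cycle (m = 7).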

12. Trend using Moving Average

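A moving average smooths out short-term ups and downs so the trend becomes visible.

Sales = 100, 120, 140
3-period average = (100 + 120 + 140) / 3 = 120
Trend = increasing (each value is higher than the one before)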
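
A short sketch of a moving average over a sliding window. The values 160 and 180 are a hypothetical continuation of the sales series so that more than one window fits.

```python
def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


sales = [100, 120, 140, 160, 180]
trend = moving_average(sales, window=3)
print(trend)  # [120.0, 140.0, 160.0]
```

The averages rise from window to window, which is what an increasing trend looks like.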

13. Detrended Series

Detrended = Original - Trend

Original sales = 150
Trend = 120
Detrended = 150 - 120 = 30

14. Residuals

Residuals are the random values that remain after removing trend and seasonality.
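
The sketch below ties trend, detrended series, seasonal component, and residuals together in one simplified additive decomposition (value = trend + seasonal + residual). It assumes an odd season length so the moving-average window can be centred. The daily sales data is synthetic, with the largest bump on the last day of each week to mimic a Sunday peak. Library routines handle more cases; this is only meant to show the arithmetic.

```python
from statistics import mean


def decompose_additive(series, m):
    """Split a series into trend, seasonal and residual parts.

    Assumes m is odd. Returns three dicts keyed by time index, covering
    only the points where a centred moving average can be computed.
    """
    half = m // 2
    # Trend: centred moving average over one full season.
    trend = {
        t: mean(series[t - half:t + half + 1])
        for t in range(half, len(series) - half)
    }
    # Detrended series: original minus trend.
    detrended = {t: series[t] - trend[t] for t in trend}
    # Seasonal component: average detrended value for each position in the season.
    seasonal_by_pos = {
        pos: mean([v for t, v in detrended.items() if t % m == pos])
        for pos in range(m)
    }
    seasonal = {t: seasonal_by_pos[t % m] for t in trend}
    # Residual: what is left after removing trend and seasonality.
    residual = {t: detrended[t] - seasonal[t] for t in trend}
    return trend, seasonal, residual


# Synthetic data: three weeks of daily sales, index 0 = Monday.
weekly_bump = [0, 0, 0, 0, 5, 15, 30]
noise = [1, -2, 0, 2, -1, 0, 1, -1, 2, 0, -2, 1, 0, -1, 2, -1, 0, 1, -2, 0, 1]
sales = [100 + t + weekly_bump[t % 7] + noise[t] for t in range(21)]

trend, seasonal, residual = decompose_additive(sales, m=7)
for t in sorted(trend):
    print(t, round(trend[t], 1), round(seasonal[t], 1), round(residual[t], 1))
```

The seasonal values are largest at the Sunday position, and the residuals are small numbers with no obvious pattern.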

15. SARIMAX Parameters

General form: (p,d,q)(P,D,Q,m)
Example: (1,1,1)(1,1,1,12)

16. Meaning of Parameters

  • p = AR Order
  • d = Differencing
  • q = MA Order
  • P = Seasonal AR
  • D = Seasonal Differencing
  • Q = Seasonal MA
  • m = Seasonal Length
Monthly data with a yearly cycle → m = 12

17. Example of m

Data         Season    m
Daily        Weekly    7
Monthly      Yearly    12
Quarterly    Yearly    4

18. Example Model

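SARIMAX(y, order=(1,1,1), seasonal_order=(1,1,1,12))
Here y is the observed time series. This model uses:
  • 1 AR term (p = 1)
  • 1 differencing (d = 1)
  • 1 MA term (q = 1)
  • 1 seasonal AR term, 1 seasonal differencing, and 1 seasonal MA term (P = D = Q = 1)
  • Yearly seasonality on monthly data (m = 12)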

19. Wrong Parameters Effect

  • Too much differencing removes useful information and adds noise
  • Too few AR terms miss the influence of past values
  • Too many AR terms cause overfitting
  • Wrong m makes the model look for the wrong seasonal cycle
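
A small sketch of the first point: differencing a series that is already stationary does not help and instead inflates its variance. The data is simulated random noise with a fixed seed.

```python
import random
from statistics import pvariance

random.seed(0)
noise = [random.gauss(0, 1) for _ in range(1000)]   # already stationary

# Unnecessary differencing (over-differencing).
diffed = [noise[t] - noise[t - 1] for t in range(1, len(noise))]

print("variance before:", round(pvariance(noise), 2))
print("variance after: ", round(pvariance(diffed), 2))
```

The variance roughly doubles, so the over-differenced series is noisier and harder to forecast than the original.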

20. Advantages

  • ✅ Handles seasonality
  • ✅ Uses external variables
  • ✅ Can give accurate forecasts on seasonal data when the parameters are well chosen

21. Limitations

  • ❌ Needs parameter tuning
  • ❌ Slow on huge data

19/02/2026

Plotting Graphs

1️⃣ Time Series Trend Graph (Correct Way)

Python Code:

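import pandas as pd
import matplotlib.pyplot as plt

# df is assumed to be an already-loaded DataFrame with 'Date' and 'Price' columns
df['Date'] = pd.to_datetime(df['Date'])
df = df.sort_values('Date')  # plot points in chronological order

plt.figure(figsize=(10,5))
plt.plot(df['Date'], df['Price'])
plt.title("Time Series Trend")
plt.xlabel("Date")
plt.ylabel("Price")
plt.xticks(rotation=45)
plt.show()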

✅ When We Use It:

  • When data has date / time column
  • Before applying ARIMA, SARIMA, LSTM
  • To check trend and pattern over years

🎯 Why We Use It:

  • To see upward/downward trend
  • To detect structural breaks
  • To identify seasonality
  • To get a first visual impression of whether the data is stationary

2️⃣ Multiple Line Comparison Graph

Python Code:

plt.figure(figsize=(10,5))
plt.plot(df['Date'], df['Min_Price'], label="Min Price")
plt.plot(df['Date'], df['Max_Price'], label="Max Price")
plt.plot(df['Date'], df['Modal_Price'], label="Modal Price")
plt.legend()
plt.xticks(rotation=45)
plt.show()

✅ When We Use It:

  • When comparing Min, Max, Modal prices
  • When comparing multiple variables over time

🎯 Why We Use It:

  • To check spread between values
  • To analyze volatility
  • To see if variables move together
  • To detect market instability

3️⃣ Monthly Seasonal Pattern Graph

Python Code:

df['Month'] = df['Date'].dt.month
monthly_avg = df.groupby('Month')['Price'].mean()

plt.figure(figsize=(8,5))
plt.plot(monthly_avg.index, monthly_avg.values)
plt.title("Monthly Seasonal Pattern")
plt.xlabel("Month")
plt.ylabel("Average Price")
plt.show()

✅ When We Use It:

  • When data spans multiple years
  • When checking seasonality
  • Before applying SARIMA

🎯 Why We Use It:

  • To confirm seasonal pattern
  • To understand cyclic behavior
  • To detect repeating trends

4️⃣ Scatter Plot (Correlation Check)

Python Code:

plt.figure(figsize=(6,5))
plt.scatter(df['Min_Price'], df['Max_Price'])
plt.xlabel("Min Price")
plt.ylabel("Max Price")
plt.show()

✅ When We Use It:

  • Before Linear Regression
  • To check relationship between variables
  • For feature selection

🎯 Why We Use It:

  • To detect correlation
  • To check linearity
  • To detect outliers
  • To avoid useless predictors

5️⃣ Histogram (Distribution Check)

Python Code:

plt.figure(figsize=(6,5))
plt.hist(df['Price'], bins=30)
plt.xlabel("Price")
plt.ylabel("Frequency")
plt.show()

✅ When We Use It:

  • During EDA
  • Before regression modeling
  • To check normality

🎯 Why We Use It:

  • To check skewness
  • To detect heavy tails
  • To decide transformation
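
What the histogram shows visually can also be put into a number. The sketch below computes skewness by hand and shows how a log transformation reduces it. The price values are hypothetical, and the formula used is the simple population skewness (mean of cubed z-scores).

```python
import math
from statistics import mean, pstdev


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mu, sigma = mean(values), pstdev(values)
    return mean([((v - mu) / sigma) ** 3 for v in values])


# Hypothetical prices with a long right tail (a few very high values).
prices = [1000, 1100, 1200, 1250, 1300, 1400, 1500, 3000, 6000]
log_prices = [math.log(p) for p in prices]

print("skewness of prices:    ", round(skewness(prices), 2))
print("skewness of log prices:", round(skewness(log_prices), 2))
```

A positive value means a right-skewed distribution; the smaller value after the log transform suggests the transformed data is closer to symmetric.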

6️⃣ Box Plot (Outlier Detection)

Python Code:

plt.figure(figsize=(6,5))
plt.boxplot(df['Price'].dropna())  # drop missing values, which boxplot cannot summarize
plt.ylabel("Price")
plt.show()

✅ When We Use It:

  • Before cleaning dataset
  • To detect extreme values
  • During preprocessing

🎯 Why We Use It:

  • Shows median
  • Shows IQR
  • Detects outliers visually
  • Helps cleaning decision
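
The outliers drawn by a box plot follow the 1.5 × IQR rule, which can be sketched in plain Python. The prices are hypothetical, and the quartiles come from `statistics.quantiles`, whose default method can differ slightly from the one a plotting library uses.

```python
from statistics import quantiles


def iqr_outliers(values):
    """Return values lying more than 1.5 * IQR outside the quartiles."""
    q1, _median, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < lower or v > upper]


prices = [1000, 1100, 1200, 1250, 1300, 1400, 1500, 6000]
print(iqr_outliers(prices))  # [6000]
```

Whether such a value is removed, capped, or kept is a cleaning decision that depends on whether it is a data error or a genuine price spike.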

7️⃣ Correlation Matrix Heatmap

Python Code:

import seaborn as sns

plt.figure(figsize=(8,6))
corr = df.corr(numeric_only=True)  # use only numeric columns; text columns would raise an error in recent pandas
sns.heatmap(corr, annot=True)
plt.show()

✅ When We Use It:

  • Before ML modeling
  • When multiple numeric features exist
  • For feature selection

🎯 Why We Use It:

  • To detect multicollinearity
  • To remove redundant variables
  • To choose best predictors
  • To improve model accuracy

1️⃣ LINE PLOT (Most Important for You)

Python Code

plt.figure(figsize=(10,5))

plt.plot(data["Month"],
         data["Modal Price (Rs./Quintal)"],
         marker="o",
         linewidth=2)

plt.xlabel("Month")
plt.ylabel("Modal Price")
plt.title("Monthly Modal Price Trend")

plt.show()

Explanation of Each Function

Function / Argument           What It Does
plt.figure(figsize=(10,5))    Sets graph size
plt.plot(x,y)                 Creates line graph
marker="o"                    Adds circle dots
linewidth=2                   Controls line thickness
plt.xlabel()                  Name of X-axis
plt.ylabel()                  Name of Y-axis
plt.title()                   Graph title
plt.show()                    Displays graph

When We Use Line Plot?

  • Time series data
  • Monthly/Yearly crop prices
  • Trend analysis
  • Forecasting models (ARIMA, LSTM)
A line plot suits the crop price dataset because the data is time-based. It helps to see trends, seasonality, and price movement.

2️⃣ HISTOGRAM

Python Code

plt.hist(data["Modal Price (Rs./Quintal)"],
         bins=10,
         edgecolor="black")

plt.xlabel("Modal Price")
plt.ylabel("Frequency")
plt.title("Distribution of Modal Price")

plt.show()

Explanation

Function / Argument    What It Does
plt.hist()             Creates histogram
bins=10                Divides data into 10 ranges
edgecolor              Border color of bars

When We Use Histogram?

  • To check data distribution
  • Before normalization
  • Before regression models
  • To detect skewness
A histogram helps check whether the data is roughly normally distributed and whether outliers exist.

3️⃣ SCATTER PLOT

Python Code

plt.scatter(data["Min Price (Rs./Quintal)"],
            data["Max Price (Rs./Quintal)"])

plt.xlabel("Min Price")
plt.ylabel("Max Price")
plt.title("Min vs Max Price Relationship")

plt.show()

When We Use Scatter Plot?

  • Check relationship between two variables
  • Before regression
  • Correlation checking
  • If points move upward → Positive correlation
  • If points move downward → Negative correlation
  • If points are scattered randomly → No strong relation
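
The direction seen in a scatter plot can be confirmed with the Pearson correlation coefficient, which lies between -1 and +1. The sketch below computes it by hand on hypothetical price values.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


min_price = [1000, 1100, 1200, 1300, 1400]
max_price = [1500, 1650, 1700, 1900, 2000]   # rises with min_price
falling = [2000, 1900, 1700, 1650, 1500]     # falls as min_price rises

print(round(pearson_r(min_price, max_price), 3))  # close to +1
print(round(pearson_r(min_price, falling), 3))    # close to -1
```

A value near 0 would correspond to the randomly scattered case with no strong linear relation.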

4️⃣ BAR GRAPH

Python Code

yearly_avg = data.groupby("Year")["Modal Price (Rs./Quintal)"].mean()

plt.bar(yearly_avg.index,
        yearly_avg.values)

plt.xlabel("Year")
plt.ylabel("Average Modal Price")
plt.title("Yearly Average Modal Price")

plt.show()

When We Use Bar Graph?

  • Compare categories
  • Compare yearly performance
  • Compare ML model metrics
Bar graph is best for comparison of categories and yearly averages.

5️⃣ STACK / AREA PLOT

Python Code

plt.stackplot(data["Month"],
              data["Min Price (Rs./Quintal)"],
              data["Modal Price (Rs./Quintal)"],
              data["Max Price (Rs./Quintal)"],
              labels=["Min", "Modal", "Max"])

plt.xlabel("Month")
plt.ylabel("Price")
plt.title("Price Comparison Area Plot")
plt.legend()

plt.show()

When We Use It?

  • Compare contribution
  • Show composition of multiple variables
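Shows how the price series compare over time. Note that a stack plot draws each series on top of the previous one, so the top edge of the plot is the sum of the three prices, not the Max price itself.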

6️⃣ PIE CHART

Python Code

avg_prices = [
    data["Min Price (Rs./Quintal)"].mean(),
    data["Modal Price (Rs./Quintal)"].mean(),
    data["Max Price (Rs./Quintal)"].mean()
]

labels = ["Min", "Modal", "Max"]

plt.pie(avg_prices,
        labels=labels,
        autopct="%1.1f%%")

plt.title("Average Price Contribution")

plt.show()

When We Use It?

  • To show percentage distribution
  • For presentation
  • For reports
The pie chart shows each average price as a percentage of the sum of the three averages.