















PROJECT BASED LAB REPORT
On

Submitted in partial fulfilment of the requirements for the award of the degree of
Bachelor of Technology
in
Computer Science and Engineering

Under the esteemed guidance of
Dr. K. Bhanu Prakash (Professor)

By
STUDENT ID    STUDENT NAME
170030104     Bandi. Sandeep Reddy
170031008     P B N Anusha

(DST-FIST Sponsored Department)
K L EDUCATION FOUNDATION
Green Fields, Vaddeswaram, Guntur District-522 502
2019-
TABLE OF CONTENTS

ABSTRACT
CHAPTER 1: INTRODUCTION
  1.1 INTRODUCTION
  1.2 PROBLEM DEFINITION
  1.3 SCOPE
  1.4 PURPOSE
  1.5 PROBLEM AND EXISTING TECHNOLOGY
  1.6 PROPOSED SYSTEM
CHAPTER 2: REQUIREMENTS & ANALYSIS
  2.1 PLATFORM REQUIREMENTS
  2.2 MODULE DESCRIPTION
CHAPTER 3: DESIGN & IMPLEMENTATION
  3.1 ALGORITHMS
  3.2 PSEUDO CODE
CHAPTER 4: SCREENSHOTS
CHAPTER 5: CONCLUSION
CHAPTER 6: REFERENCES
Data Warehousing

A data warehouse is electronic storage of a large amount of business information, designed for query and analysis rather than transaction processing. Data warehousing is the process of transforming data into information and making it available to users for analysis.

Data Mining

Data mining is the search for hidden, valid, and potentially useful patterns in large data sets. It is about discovering unsuspected or previously unknown relationships in the data. It is a multi-disciplinary skill that draws on machine learning, statistics, AI and database technology.

1.1. Introduction

Rainfall prediction is the application of science and technology to predict the amount of rainfall over a region. Accurate rainfall estimates are important for effective use of water resources, crop productivity and pre-planning of water structures. In this project, we used Linear Regression to predict how many inches of rainfall we can expect.

1.2 Problem Definition

Rainfall must be estimated accurately to support effective use of water resources, crop productivity and pre-planning of water structures.
1.3 Scope

The model tells us how many inches of rainfall we can expect.

1.4 Purpose

Weather forecasts are important for several reasons. They are a product of science that affects the lives of many people, and they would certainly be missed if they were not available. The following is a list of reasons why weather forecasts are important:
2.1. Platform Requirements

Hardware / Software    Element       Specification / version
Hardware               Processor     i
Hardware               RAM           2GB
Hardware               Hard Disk     250GB
Software               OS            Windows, Linux
Software               Jupyter Notebook, Python 3, Python IDE, Microsoft Azure

2.2. Modules Description

This project has two modules.
Linear regression predicts the value of a dependent variable (y) from a given independent variable (x). The technique finds a linear relationship between x (input) and y (output), hence the name Linear Regression. In the figure above, X (input) is the work experience and Y (output) is the salary of a person. The regression line is the best-fit line for our model.
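As a minimal sketch of how a best-fit line can be computed, the snippet below writes the line as y = m*x + c and applies the ordinary least-squares formulas with NumPy. The function name fit_line and the experience/salary numbers are hypothetical, chosen only for illustration; they are not taken from the project.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares for a line y = m*x + c; returns (m, c)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    c = y_mean - m * x_mean
    return m, c

# Hypothetical data (arbitrary units) lying exactly on y = 2x + 1.
experience = [1, 2, 3, 4, 5]
salary = [3, 5, 7, 9, 11]

m, c = fit_line(experience, salary)
print("slope:", m, "intercept:", c)
```

With real, noisy data the points will not sit exactly on the line; the same formulas then give the line that minimises the sum of squared vertical distances to the points.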
Hypothesis function for Linear Regression:

y = mx + c

where y is the response variable, x is the predictor variable, and m and c are constants called the coefficients (m is the slope and c is the intercept).

2.3. Data Set

The dataset is a public weather dataset from Austin, Texas, available on Kaggle.

austin_weather.csv

Columns:
Date - The date of the collection (YYYY-MM-DD)
TempHighF - High temperature, in degrees Fahrenheit
TempAvgF - Average temperature, in degrees Fahrenheit
TempLowF - Low temperature, in degrees Fahrenheit
DewPointHighF-
Average visibility, in miles
VisibilityLowMiles - Low visibility, in miles
WindHighMPH - High wind speed, in miles per hour
WindAvgMPH - Average wind speed, in miles per hour
WindGustMPH - Highest wind speed gust, in miles per hour
PrecipitationSumInches - Total precipitation, in inches ('T' if trace)
Events - Adverse weather events (' ' if None)
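The column notes above imply that the raw file is not purely numeric: precipitation can hold 'T' for a trace amount, and Date and Events are not numeric features. The following is a rough sketch of one way such columns could be prepared for regression. The sample rows are made up, the helper name clean_weather is hypothetical, and two choices are assumptions rather than the project's documented method: treating 'T' as 0 inches and treating '-' as a missing reading.

```python
import pandas as pd

# Hypothetical rows shaped like austin_weather.csv (values are made up).
raw = pd.DataFrame({
    "Date": ["2013-12-21", "2013-12-22", "2013-12-23", "2013-12-24"],
    "TempAvgF": ["60", "48", "45", "46"],
    "WindAvgMPH": ["4", "6", "-", "3"],
    "PrecipitationSumInches": ["0.46", "T", "0", "0"],
    "Events": ["Rain , Thunderstorm", " ", " ", " "],
})

def clean_weather(df):
    """Return a numeric-only copy of df with incomplete rows removed."""
    # Date and Events are not numeric predictors, so drop them.
    out = df.drop(columns=["Date", "Events"])
    # Assumption: a trace amount ('T') is counted as 0 inches.
    out["PrecipitationSumInches"] = out["PrecipitationSumInches"].replace("T", "0")
    # Anything non-numeric (assumed here to be '-') becomes NaN.
    out = out.apply(pd.to_numeric, errors="coerce")
    # Drop rows that still contain a missing value.
    return out.dropna().reset_index(drop=True)

cleaned = clean_weather(raw)
print(cleaned)
```

Dropping incomplete rows is the simplest option; filling them with a column mean is a common alternative when too many rows would otherwise be lost.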
3.1 Algorithms

Linear Regression:
Module-1: Data gathering and pre-processing.
Module-2: Applying the algorithm for prediction.

3.2 Source Code
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# read the raw dataset
data = pd.read_csv("C:/Users/TEMP.SANDEEP/Desktop/austin_weather.csv")
# seeing head values
data.head(5)
# seeing shape of the dataset
data.shape
plt.show()
#basic static
data.to_csv('C:/Users/TEMP.SANDEEP/Desktop/austin_final_final.csv')
import pandas as pd
import numpy as np
import sklearn as sk
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
data = pd.read_csv("C:/Users/TEMP.SANDEEP/Desktop/austin_final_final.csv")
X = data.drop(['PrecipitationSumInches'], axis = 1)
Y = data['PrecipitationSumInches']
Y = Y.values.reshape(-1, 1)
day_index = 798
days = [i for i in range(Y.size)]
clf = LinearRegression()
clf.fit(X, Y)
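# sample input: one value per feature column
inp = np.array([[74], [60], [45], [67], [49], [43], [33], [45], [57],
                [29.68], [10], [7], [2], [0], [20], [4], [31]])
# reshape the column of values into a single row (1 sample, 17 features)
inp = inp.reshape(1, -1)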
print('The precipitation in inches for the input is:', clf.predict(inp))
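The reshape before predict matters because scikit-learn expects a 2-D array of shape (n_samples, n_features). The sketch below reproduces the same fit-and-predict pattern on small synthetic data so the shapes are easy to follow. The three features, their coefficients and the intercept are hypothetical and unrelated to the Austin dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, noise-free data: 50 samples, 3 features (all values hypothetical).
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(50, 3))
true_coef = np.array([0.5, -0.2, 0.1])
Y = (X @ true_coef + 4.0).reshape(-1, 1)   # 2-D target, as in the source code

model = LinearRegression()
model.fit(X, Y)

# One sample written as a column, shape (3, 1) ...
inp = np.array([[74], [60], [45]])
# ... reshaped into a single row, shape (1, 3), which predict() accepts.
inp = inp.reshape(1, -1)

pred = model.predict(inp)
print("input shape:", inp.shape)
print("prediction shape:", pred.shape)
print("prediction:", pred)
```

Because Y was reshaped to two dimensions before fitting, predict returns a 2-D array as well, which is why the printed result appears inside double brackets.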
plt.scatter(days[day_index], x_vis[x_vis.columns.values[i]][day_index], color='r')
plt.title(x_vis.columns.values[i])
plt.show()

OUTPUT:
The precipitation in inches for the input is: [[1.33868402]]

Graphs:
2) The precipitation trend graph: