Storing Data in Database using Python
In most of the data science projects, we need to deal with the database. We need to store values in the database. I was working with one of the use cases where I need to store the data in a database like MS SQL, MySQL, Oracle DB.
Scenario
I had to read the database from CSV/excel file and I need to analyze the dataset and perform some analysis on the dataset and need to store the values in the database.
In your case, the data sources may be from any data sources like MySQL, SQL, etc.
Dataset
I have a dataset named data.csv.The dataset has two columns value1 and values. I need to store these values into the database. Here, I am using a MySQL database to store the data. You can use any databases. It depends on your requirements.
Libraries Installation
You need to install the necessary libraries. For the MySQL database, you need to install MySQL.connector.You can install the libraries by typing the following command:
pip install mysql-connector
Creating a database, table
We must have installed a database like MySQL, SQL, etc in our local system so that we can create a database and tables in the database. I am using MySQL Workbench.I have used the following simple queries to create database and table:
create database mediumdb;
use mediumdb;
CREATE TABLE test (
value1 int,value2 int
);
show tables;
In my case database name is mediumdb and test is the table name.
Now your databse and table is created in your local system.Now we are going to see python code.#importing necessary libraries
import mysql.connector
from mysql.connector import Error
import time
import matplotlib.pyplot as plt
import numpy as np
import time
import pandas as pd
import sqlalchemy#reading datadata=pd.read_csv("/home/sakil/Desktop/Medium/data.csv")
data.head()
database_username = 'root' #database username,you need to give of yours
database_password = 'root' #database password
database_ip = 'localhost'
database_name = 'mediumdb' #database name
database_connection = sqlalchemy.create_engine('mysql+mysqlconnector://{0}:{1}@{2}/{3}'.
format(database_username, database_password,
database_ip, database_name)).connect()
data.to_sql(con=database_connection, name='test', if_exists='replace',chunksize=100000)
database_connection.close()
print("data is inserted successfuly")
#data is the name of your dataframe
#name='test',here test is the name of the table you need to give the table name which you have created in the database
#in your local systemdata is inserted successfuly
if_exists: {‘fail’, ‘replace’, ‘append’}, default ‘fail’ How to behave if the table already exists.
* fail: Raise a ValueError.
* replace: Drop the table before inserting new values.
* append: Insert new values to the existing table.You can know more about the function signature by pressing Shift+Tab on specofied function.
Data is inserted successfully
Data is successfully inserted into your database.I have attached the screenshot of my database.
from IPython.display import Image
Image(filename='database1.png',width=800, height=400)
Final Thought
If you are working as a data scientist, it is very important to know how to integrate your python code to databases. In most cases, we need to store the data into the databases.I just wanted to share my experience with the data science community as it may help others.