Storing Data in Database using Python

Sakil Ansari
3 min readJul 2, 2020

--

In most of the data science projects, we need to deal with the database. We need to store values in the database. I was working with one of the use cases where I need to store the data in a database like MS SQL, MySQL, Oracle DB.

Scenario

I had to read the database from CSV/excel file and I need to analyze the dataset and perform some analysis on the dataset and need to store the values in the database.

In your case, the data sources may be from any data sources like MySQL, SQL, etc.

Dataset

I have a dataset named data.csv.The dataset has two columns value1 and values. I need to store these values into the database. Here, I am using a MySQL database to store the data. You can use any databases. It depends on your requirements.

Libraries Installation

You need to install the necessary libraries. For the MySQL database, you need to install MySQL.connector.You can install the libraries by typing the following command:

pip install mysql-connector

Creating a database, table

We must have installed a database like MySQL, SQL, etc in our local system so that we can create a database and tables in the database. I am using MySQL Workbench.I have used the following simple queries to create database and table:

create database mediumdb;

use mediumdb;

CREATE TABLE test (

value1 int,value2 int

);

show tables;

In my case database name is mediumdb and test is the table name.

Now your databse and table is created in your local system.Now we are going to see python code.#importing necessary libraries
import mysql.connector
from mysql.connector import Error
import time
import matplotlib.pyplot as plt
import numpy as np
import time
import pandas as pd
import sqlalchemy
#reading datadata=pd.read_csv("/home/sakil/Desktop/Medium/data.csv")
data.head()
png
database_username = 'root'  #database username,you need to give of yours 
database_password = 'root' #database password
database_ip = 'localhost'
database_name = 'mediumdb' #database name
database_connection = sqlalchemy.create_engine('mysql+mysqlconnector://{0}:{1}@{2}/{3}'.
format(database_username, database_password,
database_ip, database_name)).connect()
data.to_sql(con=database_connection, name='test', if_exists='replace',chunksize=100000)
database_connection.close()
print("data is inserted successfuly")
#data is the name of your dataframe
#name='test',here test is the name of the table you need to give the table name which you have created in the database
#in your local system
data is inserted successfuly

if_exists: {‘fail’, ‘replace’, ‘append’}, default ‘fail’ How to behave if the table already exists.

* fail: Raise a ValueError.
* replace: Drop the table before inserting new values.
* append: Insert new values to the existing table.
You can know more about the function signature by pressing Shift+Tab on specofied function.

Data is inserted successfully

Data is successfully inserted into your database.I have attached the screenshot of my database.

from IPython.display import Image
Image(filename='database1.png',width=800, height=400)
png

Final Thought

If you are working as a data scientist, it is very important to know how to integrate your python code to databases. In most cases, we need to store the data into the databases.I just wanted to share my experience with the data science community as it may help others.

--

--

Sakil Ansari
Sakil Ansari

Written by Sakil Ansari

Working as a Data Scientist/ML/NLP/Speech Recognition/Deep learning

No responses yet