Installation and Configuration of Fake2DB Tool for Auto Generate Fake but Valid Data

3 years ago SETHA THAY 2483

Data has become one of the most valuable assets of the digital era. With data in hand, you can run experiments to measure server performance, plan future improvements, and build visualizations as needed. For example, Facebook uses user data to help business owners target their customers and improve their marketing strategy and sales performance. Before applying a technique in production, engineers usually test it on a dataset: sometimes a copy of production data, sometimes mock data they have to create themselves. A tool that generates mock data automatically makes these experiments much easier.

Many such tools are available. In this post, we introduce a tool called "fake2db", which works with a variety of databases such as PostgreSQL, MongoDB, MySQL, Redis, and CouchDB. Fake2DB generates fake but valid data for testing purposes using common data patterns. You will learn how to install and configure Fake2DB, how to use it, and some further notes about it.

INSTALLATION

Prerequisites: Before installing and configuring fake2db, install PostgreSQL as described in the post "Installing PostgreSQL 13 from Source in Ubuntu 20.04", then follow the instructions below.

First, we need pip, the package installer for Python. Ubuntu 20.04 ships with Python 3 pre-installed, which you can confirm in the terminal:

python3 --version

Next, use the command below to install pip:

sudo apt install python3-pip

pip3 --version   # check the installed version of pip

Next, install fake2db with pip:

pip install fake2db

Collecting fake2db
  Downloading fake2db-0.5.4.tar.gz (10 kB)
Collecting Faker==0.7.11
  Downloading Faker-0.7.11-py2.py3-none-any.whl (579 kB)
Collecting python-dateutil>=2.4
  Downloading python_dateutil-2.8.2-py2.py3-none-any.whl (247 kB)
Requirement already satisfied: six in /usr/lib/python3/dist-packages (from Faker==0.7.11->fake2db) (1.14.0)
Building wheels for collected packages: fake2db
  Building wheel for fake2db (setup.py) ... done
  Created wheel for fake2db: filename=fake2db-0.5.4-py3-none-any.whl size=17157 sha256=f73072fb3a803440c68ae03145d7a4c50e1cb052b6c7d619bde773481dcf0c52
  Stored in directory: /home/ubuntu/.cache/pip/wheels/0b/ca/cc/b016e22cc271ee0f5ce223522a1ee033191ab00d57fe1383b7
Successfully built fake2db
Installing collected packages: python-dateutil, Faker, fake2db
  WARNING: The script faker is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
  WARNING: The script fake2db is installed in '/home/ubuntu/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
Successfully installed Faker-0.7.11 fake2db-0.5.4 python-dateutil-2.8.2

The warnings above go away once the directory is added to PATH in the shell profile:

echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc && source ~/.bashrc
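
To confirm that the PATH change took effect, a small check can be run from Python. This is a minimal sketch using only the standard library; the helper name is_on_path is made up for illustration, and ~/.local/bin is assumed to be the directory named in the pip warning.

```python
import os
import shutil
from pathlib import Path


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in PATH.

    `path_value` defaults to the PATH of the current process.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = Path(directory).expanduser()
    entries = [Path(p).expanduser() for p in path_value.split(os.pathsep) if p]
    return target in entries


if __name__ == "__main__":
    # Assumption: pip placed the scripts in ~/.local/bin, as in the warning.
    local_bin = Path.home() / ".local" / "bin"
    print("~/.local/bin on PATH:", is_on_path(local_bin))
    # shutil.which returns the full path of the script, or None if not found.
    print("fake2db found at:", shutil.which("fake2db"))
```

If the second line prints None, open a new terminal or run source ~/.bashrc again before retrying.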

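Next, install the PostgreSQL development headers if they are not already present. On Ubuntu the package is named libpq-dev (postgresql-devel is the name used on RPM-based distributions such as CentOS and Fedora):

sudo apt install libpq-dev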

Next, install psycopg2. It is a Python package, so it is installed with pip rather than apt:

pip install psycopg2-binary
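
Before moving on, it can help to confirm that the Python modules used in this walkthrough are importable. The sketch below uses only the standard library; the function name missing_modules is hypothetical, and the module names fake2db, faker, and psycopg2 are assumed to be the import names of the packages installed above.

```python
import importlib.util

# Assumed import names: fake2db itself, its Faker dependency,
# and the psycopg2 adapter used for PostgreSQL.
REQUIRED_MODULES = ["fake2db", "faker", "psycopg2"]


def missing_modules(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```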

Finally, check that fake2db is properly installed on the server:

fake2db --help

usage: fake2db [-h] [--rows ROWS] [--db DB] [--name NAME] [--host HOST] [--port PORT] [--username USERNAME] [--password PASSWORD] [--custom CUSTOM [CUSTOM ...]] [--locale LOCALE] [--seed SEED]

optional arguments:
  -h, --help            show this help message and exit
  --rows ROWS           Amount of rows desired per table
  --db DB               Db type for creation: sqlite, mysql, postgresql, mongodb, redis, couchdb, to be expanded
  --name NAME           The name to the db to be generated
  --host HOST           Hostname of db
  --port PORT           Port of db
  --username USERNAME   Username
  --password PASSWORD   Password
  --custom CUSTOM [CUSTOM ...]
                        Custom schema for db generation, supports functions that fake-factory provides, see fake2db github repository for options https://github.com/emirozer/fake2db
  --locale LOCALE       The locale of the data to be generated: {bg_BG,cs_CZ,...,zh_CN,zh_TW}. 'en_US' as default
  --seed SEED           Seed value for the random generator
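
When fake2db is called from a script, the options listed in the help output can be assembled programmatically. The following is a minimal sketch, not part of fake2db itself: build_fake2db_command is a hypothetical helper, and it only uses flags that appear in the help text above.

```python
import shlex


def build_fake2db_command(db, rows, host=None, port=None, username=None,
                          password=None, name=None, custom=None, seed=None):
    """Build the argument list for a fake2db call from keyword options."""
    cmd = ["fake2db", "--db", db, "--rows", str(rows)]
    optional = {
        "--host": host,
        "--port": port,
        "--username": username,
        "--password": password,
        "--name": name,
        "--seed": seed,
    }
    # Only pass the options that were actually given.
    for flag, value in optional.items():
        if value is not None:
            cmd += [flag, str(value)]
    # --custom takes one or more column keys.
    if custom:
        cmd += ["--custom", *custom]
    return cmd


if __name__ == "__main__":
    cmd = build_fake2db_command("postgresql", 250, host="localhost",
                                username="postgres", password="pwd",
                                custom=["name", "date", "country"])
    # Print the command instead of running it; to execute it, pass the
    # list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when a password or database name contains special characters.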

USAGE

Example 1: Generate 2500 records

fake2db --db postgresql --rows 2500 --host localhost --password pwd --user postgres

2021-08-13 03:26:44,845 ubuntu   Rows argument : 2500
2021-08-13 03:26:45,412 ubuntu   Database created and opened succesfully: postgresql_coyygykv
2021-08-13 03:26:46,846 ubuntu   simple_registration Commits are successful after write job!
2021-08-13 03:26:50,792 ubuntu   detailed_registration Commits are successful after write job!
2021-08-13 03:26:56,838 ubuntu   companies Commits are successful after write job!
2021-08-13 03:26:57,215 ubuntu   user_agent Commits are successful after write job!
2021-08-13 03:27:01,936 ubuntu   customer Commits are successful after write job!

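With the above command, fake2db creates a new database with a random name (here "postgresql_coyygykv") and 5 new tables (simple_registration, detailed_registration, companies, user_agent, and customer) with 2500 rows in each table. Note that --user works as a shortened form of --username, because the argument parser accepts unambiguous option prefixes.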
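
To verify the result, the row count of every generated table can be read back with psycopg2. This is a rough sketch under a few assumptions: the tables are created in the default public schema, the connection details match the example command, and the helper names (list_public_tables, count_rows, tables_with_unexpected_count) are made up for illustration. Table names are read from the database rather than hard-coded, since the names in the log output may not match the actual table names exactly.

```python
def list_public_tables(cur):
    """Return the names of all ordinary tables in the public schema."""
    cur.execute(
        "SELECT table_name FROM information_schema.tables "
        "WHERE table_schema = 'public' AND table_type = 'BASE TABLE' "
        "ORDER BY table_name"
    )
    return [row[0] for row in cur.fetchall()]


def count_rows(dbname, user, password, host="localhost"):
    """Return {table_name: row_count} for every table in the database."""
    # Imported here so the pure helpers work without psycopg2 installed.
    import psycopg2
    from psycopg2 import sql

    conn = psycopg2.connect(dbname=dbname, user=user,
                            password=password, host=host)
    try:
        with conn.cursor() as cur:
            counts = {}
            for table in list_public_tables(cur):
                # sql.Identifier quotes the table name safely.
                query = sql.SQL("SELECT count(*) FROM {}").format(
                    sql.Identifier(table))
                cur.execute(query)
                counts[table] = cur.fetchone()[0]
            return counts
    finally:
        conn.close()


def tables_with_unexpected_count(counts, expected_rows):
    """Return the tables whose row count differs from expected_rows."""
    return sorted(t for t, n in counts.items() if n != expected_rows)


# Example usage (requires a running PostgreSQL server):
#   counts = count_rows("postgresql_coyygykv", "postgres", "pwd")
#   print(tables_with_unexpected_count(counts, 2500))
```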


Example 2: Generate 250 rows in a database named postgresql123456, in a table named custom with 3 columns (name, date, country)

fake2db --rows 250 --db postgresql --user postgres --password pwd --host localhost --name postgresql123456 --custom name date country

2021-08-13 05:16:20,056 ubuntu   Rows argument : 250
2021-08-13 05:16:20,586 ubuntu   Database created and opened succesfully: postgresql123456
2021-08-13 05:16:20,587 ubuntu   fake2db found valid custom key provided: name
2021-08-13 05:16:20,587 ubuntu   fake2db found valid custom key provided: date
2021-08-13 05:16:20,587 ubuntu   fake2db found valid custom key provided: country
2021-08-13 05:16:20,817 ubuntu   custom Commits are successful after write job!

Example 3: Define more columns for Fake2DB to generate data for us

fake2db --rows 250 --db postgresql --user postgres --password pwd --host localhost --name postgresql123457 --custom name date country currency_code credit_card_full credit_card_provider

2021-08-13 06:26:24,018 ubuntu   Rows argument : 250
2021-08-13 06:26:24,174 ubuntu   Database created and opened succesfully: postgresql123457
2021-08-13 06:26:24,175 ubuntu   fake2db found valid custom key provided: name
2021-08-13 06:26:24,175 ubuntu   fake2db found valid custom key provided: date
2021-08-13 06:26:24,175 ubuntu   fake2db found valid custom key provided: country
2021-08-13 06:26:24,176 ubuntu   fake2db found valid custom key provided: currency_code
2021-08-13 06:26:24,176 ubuntu   fake2db found valid custom key provided: credit_card_full
2021-08-13 06:26:24,176 ubuntu   fake2db found valid custom key provided: credit_card_provider
2021-08-13 06:26:24,553 ubuntu   custom Commits are successful after write job!

Example 4: Define some more columns :)

fake2db --rows 250 --db postgresql --user postgres --password pwd --host localhost --name postgresql123458 --custom name date country currency_code credit_card_full credit_card_provider postalcode date_time_ad day_of_week

2021-08-13 06:28:22,386 ubuntu   Rows argument : 250
2021-08-13 06:28:22,556 ubuntu   Database created and opened succesfully: postgresql123458
2021-08-13 06:28:22,556 ubuntu   fake2db found valid custom key provided: name
2021-08-13 06:28:22,556 ubuntu   fake2db found valid custom key provided: date
2021-08-13 06:28:22,556 ubuntu   fake2db found valid custom key provided: country
2021-08-13 06:28:22,558 ubuntu   fake2db found valid custom key provided: currency_code
2021-08-13 06:28:22,559 ubuntu   fake2db found valid custom key provided: credit_card_full
2021-08-13 06:28:22,560 ubuntu   fake2db found valid custom key provided: credit_card_provider
2021-08-13 06:28:22,560 ubuntu   fake2db found valid custom key provided: postalcode
2021-08-13 06:28:22,561 ubuntu   fake2db found valid custom key provided: date_time_ad
2021-08-13 06:28:22,561 ubuntu   fake2db found valid custom key provided: day_of_week
2021-08-13 06:28:22,959 ubuntu   custom Commits are successful after write job!

As the examples above show, we can keep adding columns to the table we want to fill with fake but valid data. These columns are pre-defined by the fake2db team, and you can find the full list in their GitHub repo (https://github.com/emirozer/fake2db). Moreover, if you are a Python developer, you can extend the tool with additional custom columns as needed. The repo also explains how to integrate with other databases and gives more examples of custom data generation.
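
Since the help text says the custom keys correspond to functions of the underlying Faker library, a candidate key can be pre-checked by looking for a method of that name on a Faker generator. This is only an approximation: fake2db keeps its own list of accepted keys, so a name that Faker provides is not guaranteed to be accepted. The helper split_custom_keys is hypothetical and works with any object that exposes methods.

```python
def split_custom_keys(keys, generator):
    """Split keys into (available, unavailable) lists.

    A key counts as available when `generator` has a callable
    attribute with that name.
    """
    available, unavailable = [], []
    for key in keys:
        if callable(getattr(generator, key, None)):
            available.append(key)
        else:
            unavailable.append(key)
    return available, unavailable


if __name__ == "__main__":
    try:
        # Faker is installed as a dependency of fake2db.
        from faker import Faker
    except ImportError:
        print("Faker is not installed; run `pip install fake2db` first.")
    else:
        fake = Faker()
        ok, bad = split_custom_keys(
            ["name", "date", "country", "not_a_real_key"], fake)
        print("available:", ok)
        print("unavailable:", bad)
```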

NOTES

  • pip is the standard package manager used to install and manage software written in Python. Most distributions of Python come with pip pre-installed.
  • The postgresql-devel package (its closest equivalent on Ubuntu and Debian is libpq-dev) contains the header files and libraries needed to compile C or C++ applications that interact directly with a PostgreSQL database server, as well as the ecpg Embedded C Postgres preprocessor. You need to install this package if you want to develop applications that interact with a PostgreSQL server.
  • psycopg2 is the most popular PostgreSQL database adapter for the Python programming language. It is implemented mostly in C as a libpq wrapper, which makes it both efficient and secure.
