Cara menggunakan connect sqlite to excel

Businesses use a wide array of accounting applications to manage their finances. Some businesses use commercial accounting or financial applications while others create in-house programs using tools such as Microsoft Access or Visual Basic. While accounting application features vary considerably, one thing that virtually all of them share in common is the use of a back-end database to index and store data.

Many custom accounting applications use the open source SQLite engine to manage data. If your business accounting program connects to an SQLite database, you might want to export data from the application for analysis and use in Microsoft Excel. If the application does not have a built-in control to export data and create spreadsheets, you can use tools included in Excel to retrieve datasets from an SQLite database and insert into your workbooks quickly and efficiently.

  1. 1.

    Download the SQLite ODBC driver from the Ch-Werner.de website (link in Resources). Install the driver on your computer.

  2. 2.

    Open Microsoft Excel. Create a new worksheet or open the one to which you want to add data from the SQLite database.

  3. 3.

    Click “Data” on the Excel ribbon menu. Click the “From Other Sources” button above the “Get External Data” section label. After the drop-down list appears, click “From Microsoft Query.”

  4. 4.

    Disable the “Use the Query Wizard to Create/Edit” Queries” option. Click the “New Data Source” option, and then click “OK.”

  5. 5.

    Click and enable the “SQLite3 ODBC Driver” option. Click “OK.” A new query window opens and displays the tables from your SQLite database. Highlight and select the table containing the data that you want to import into Excel, then click “Close.”

  6. 6.

    Click the “*” symbol in the Fields list. Next, click the “>” button in the middle of the window to import all of the fields from the table into Excel.

  7. 7.

    Select desired filter options for the data. Filters instruct Excel to retrieve only data from the SQLite data table that meets specified conditions. For example, if your SQLite data table contains a list of customers and their addresses or contact information, the table probably has a field labeled “fldTelephone” or something similar. If you want only to retrieve data for customers who have a "211" area code, for example, you can do so by applying a filter. Click the “fldTelephone” or similarly named field in the “Column to Filter” pane. Click the “Contains” option in the filter list. Enter “211” in the field next to the “Contains” option and click “Next.”

  8. 8.

    Click the “Next” button to sort the records in descending order, which is the default, or enable the ascending sort order option and click “Next.” Click and enable the “Return Data to Microsoft Excel” option. Click the “Finish” button to close the new query window.

  9. 9.

    Click and enable the “Table” and “Existing Worksheet” options in the Import Data window. Click an empty cell on the Excel spreadsheet where you want the data table from the SQLite database to appear. Click the “OK” button. Excel creates and displays a new table with retrieved records from the SQLite data table based on any filters you used in the query.

    Ceritanya tadi ada task untuk mengolah data sebuah file excel berisi 22970 baris data transfer-transferan. Inti tugasnya adalah menampilkan seluruh pengirim (

    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    0), penerima (
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    1), dan transaksi (
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    2) yang duplicated.

    Sebenarnya gampang, ya, tinggal klik-klik langsung filter di excel. Tapi data sebanyak 22970 ini cukup untuk membuat komputer gw ngos-ngosan.

    Terus kepikiran pengen buat macro di VBA Excel, nanti diquery pakai sql buat filternya. Tapi udah lupa cara bikin macro. Udah aja 1 jam berkutat belajar lagi macro terus menyerah pas gagal bikin koneksinya pakai ADODB (gak bisa create object kalau pakai macOS, ceunah).

    Tiba-tiba. Oh, kenapa gak pakai Python aja kalau mau diperlakukan sebagai sql? Kayaknya lebih cepat. Tapi sebenarnya gak cepat juga, karena gw pengennya data itu jadi tabel dalam sebuah db yang nantinya akan gw query dengan sql (entah mengapa ngotot banget pengen pake sql query). Sementara gw belum tau caranya convert excel ke db. Jadilah lama lagi nyari-nyari caranya. Tambah lagi udah buang waktu buat mencoba bikin macro. Kayaknya lebih cepat kalau bikin filter langsung di excel dari tadi, wkwkwk.

    Data

    File excel-nya bisa didownload di sini. Catatan, data sudah dimodifikasi, baik isi maupun jumlah field, untuk kepentingan pembelajaran semata.

    Dataset terdiri dari 22970 baris dan 5 kolom (

    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    2,
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    4,
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    5,
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    0, dan
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    1).

    Pertama, seperti biasa import dulu library yang akan digunakan.

    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    8 untuk membuat DataFrame,
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    9 untuk query sql, dan
    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    0 untuk mengexport data dari excel ke db.

    import pandas as pd
    import sqlite3 as sq
    
    # untuk export excel ke db
    from sqlalchemy import create_engine
    

    Kemudian, baca excel dan simpan di DataFame

    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    1. Terus check shape hasilnya
    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    2.

    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    

    Export dataframe ke DB

    Pertama kita siapkan koneksi dulu. Database yang akan dibuat bernama

    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    3.

    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    

    Setelah itu export Dataframe ke tabel bernama

    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    4. Di sini kita juga melakukan pengecekan, jika tabelnya sudah ada maka direplace. Bisa juga diganti dengan
    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    5 kalau tidak mau
    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    6 dengan db yang baru. Cek dokumentasinya di sini.

    # export df to sqlite db, if exists then replace
    sqlite_table = "mtpg"
    df.to_sql(sqlite_table, sqlite_connection, if_exists='replace')
    

    Proses di atas mengexport seluruh data dalam dataframe ke sebuah tabel. Jadi tidak perlu lagi mengcopy datanya satu-persatu. Setelah export selesai, tutup koneksinya.

    sqlite_connection.close()
    

    Query menggunakan SQL

    Kita coba query sederhana dulu dengan memilih 5 record pertama. Inget, koneksinya ke db tapi pas select tetap ke tabel.

    # check creted db by selecting 5 records
    conn = sq.connect("mtpg_report.db")
    df_mtpg = pd.read_sql_query ("select * from mtpg limit 5;", conn)
    df_mtpg
    

    Jika tidak ada masalah maka outputnya adalah 5 baris pertama dari data. Kalau sudah OK, kita bisa mulai query yang sesungguhnya.

    Yang pertama adalah me-list semua

    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    0 yang duplicated. Misal kita ingin mencari
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    0 yang duplicated dari
    # preparing connection to db
    engine = create_engine('sqlite:///mtpg_report.db', echo=True)
    sqlite_connection = engine.connect()
    
    9 dan
    # export df to sqlite db, if exists then replace
    sqlite_table = "mtpg"
    df.to_sql(sqlite_table, sqlite_connection, if_exists='replace')
    
    0. Untuk menampilkan semua
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    0 yang duplicated, bisa dilakukan dengan
    # export df to sqlite db, if exists then replace
    sqlite_table = "mtpg"
    df.to_sql(sqlite_table, sqlite_connection, if_exists='replace')
    
    2 diikuti dengan select
    # export df to sqlite db, if exists then replace
    sqlite_table = "mtpg"
    df.to_sql(sqlite_table, sqlite_connection, if_exists='replace')
    
    3 yang lebih dari 1.

    #  sender id duplicated
    str_query = "SELECT * \
                FROM mtpg \
                    WHERE (\"Sender ID\") IN \
                    (SELECT \"Sender ID\" \
                        FROM mtpg WHERE \"Merchant Name\" = \"WAL\" \
                        AND \"Transaction Type\" = \"Transfer OUT\" \
                        GROUP BY \"Sender ID\" \
                        HAVING COUNT (*)>1 \
                        )\
                ORDER BY \"Sender ID\" DESC"
    df_mtpg = pd.read_sql_query (str_query, conn)
    df_mtpg
    

    Contoh ke dua masih seperti yang pertama, tapi kali ini kita tambahkan kriterianya. Selain

    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    0 yang duplicated, kita juga ingin melist
    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    1 yang duplicated.

    # beneficiary id, dan sender id duplicated
    str_query = "SELECT * \
                FROM mtpg \
                    WHERE (\"Beneficiary ID\") IN \
                    (SELECT \"Beneficiary ID\" \
                     FROM mtpg WHERE \"Sender ID\" IN  \
                        (SELECT \"Sender ID\" \
                            FROM mtpg WHERE \"Merchant Name\" = \"WAL\" \
                            AND \"Transaction Type\" = \"Transfer OUT\" \
                            GROUP BY \"Sender ID\" \
                            HAVING COUNT (*)>1 \
                        )\
                    GROUP BY \"Beneficiary ID\" \
                    HAVING COUNT (*)>1 \
                    )\
                ORDER BY \"Sender ID\" DESC"
    df_mtpg = pd.read_sql_query (str_query, conn)
    df_mtpg
    

    Contoh ke-tiga, tambah lagi dengan kriteria

    # read excel, save to a dataframe
    df = pd.read_excel('mtpg_report.xlsx',header=0, index_col=0)
    
    # check shape
    df.shape
    
    2 yang duplicated.

    # invoice no, beneficiary id, dan sender id duplicated
    
    str_query = "SELECT * \
                FROM mtpg \
                    WHERE (\"Invoice Number\") IN \
                    (SELECT \"Invoice Number\" \
                        FROM mtpg WHERE \"Beneficiary ID\" IN \
                        (SELECT \"Beneficiary ID\" \
                         FROM mtpg WHERE \"Sender ID\" IN  \
                            (SELECT \"Sender ID\" \
                                FROM mtpg WHERE \"Merchant Name\" = \"WAL\" \
                                AND \"Transaction Type\" = \"Transfer OUT\" \
                                GROUP BY \"Sender ID\" \
                                HAVING COUNT (*)>1 \
                            )\
                        GROUP BY \"Beneficiary ID\" \
                        HAVING COUNT (*)>1 \
                        )\
                    GROUP BY \"Invoice Number\" \
                    HAVING COUNT (*)>1) \
                    ORDER BY \"Sender ID\" DESC"
    df_mtpg = pd.read_sql_query (str_query, conn)
    df_mtpg
    

    Terakhir, simpan hasilnya di file excel.

    # save result to an excel file
    df_mtpg.to_excel("mtpg_filter_duplicated.xlsx")
    

    That’s all. Kalau ada saran untuk meringkas query-nya, boleh di komen, hehe. Jupyter notebook untuk latihan ini bisa didownload di sini. Thank you.