File corrupted: sqlite3.DatabaseError: database disk image is malformed

sebastian borjas 0 Reputation points
2023-04-30T18:37:29.1166667+00:00

I have a datastore I made, and it was running perfectly. I made a new one and uploaded a modified version of the database, and now it gives me this error. I have included the part of my code where the error occurs. I am not sure why this happens, because the same code with a different database path works; this one lists the database file but then errors when trying to read it. Files: 1. the version that works, 2. the version that doesn't, 3. the error from version 2. Thanks. I tried uploading the database again and pointing the code at the new upload, and I get the same error. When I download the file I can view it in a database viewer, so I don't know why I get the error.
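A quick way to check whether a downloaded copy of the file is itself readable is SQLite's built-in integrity check (a minimal sketch, not from the original post; the file name below is a placeholder for the actual local path):

```python
import sqlite3

# Placeholder path to the downloaded copy of the database
db_path = "local_twodata.db"

conn = sqlite3.connect(db_path)
try:
    # integrity_check scans the whole file; it returns [('ok',)] for a
    # healthy database, otherwise a list of corruption messages
    result = conn.execute("PRAGMA integrity_check").fetchall()
    print(result)
finally:
    conn.close()
```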

The error:

  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 920, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DatabaseError: (sqlite3.DatabaseError) database disk image is malformed
[SQL: SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite~_%' ESCAPE '~' ORDER BY name]
(Background on this error at: https://sqlalche.me/e/20/4xp6)
# --- 1. Version that works ---
import numpy as np
import pandas as pd
import tensorflow as tf
import os
import sqlite3
import csv
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import classification_report, precision_recall_fscore_support, accuracy_score
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Input, LSTM, Activation, Dropout, Reshape, Conv1D, MaxPooling1D, Dense, Flatten
from tensorflow.keras.models import Model

import matplotlib.pyplot as plt
from azureml.core import Workspace, Datastore, Dataset
from azure.storage.blob import BlobServiceClient
from sqlalchemy import create_engine
from sqlalchemy import inspect
from sqlalchemy import MetaData

# Connect to your workspace
subscription_id = 'b0132afa-d29b-4d66-88f8-2c189dcba3be'
resource_group = 'resource-one'
workspace_name = 'workspace-one'

workspace = Workspace(subscription_id, resource_group, workspace_name)

# Get the datastore
datastore_name = 'workspaceblobstore'
datastore = Datastore.get(workspace, datastore_name)

# Retrieve datastore account name and key
account_name = datastore.account_name
account_key = datastore.account_key

# Create a BlobServiceClient using the storage account key
connection_string = f"DefaultEndpointsProtocol=https;AccountName={account_name};AccountKey={account_key};EndpointSuffix=core.windows.net"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)

# Download the SQLite database file to your local environment
local_file_name = "qqqq.db"
datastore.download(target_path=os.getcwd(), prefix='UI/2023-04-27_225942_UTC/qqqq.db', overwrite=True, show_progress=True)

# Get a reference to the container
container_name = 'azureml-blobstore-94c291eb-0707-4d58-8ab8-33b5c7edb276'
container_client = blob_service_client.get_container_client(container_name)

# Path of the file in the container (note: the blob client below only lists blobs;
# the actual download above used datastore.download)
file_path = 'UI/2023-04-27_225942_UTC/qqqq.db'
local_file_name = 'local_qqqq.db'

# List blobs in the container
blobs = container_client.list_blobs()

for blob in blobs:
    print("Name:", blob.name)
    print("Size:", blob.size)

# Connect to the downloaded SQLite database file
engine = create_engine(f'sqlite:///{local_file_name}')
# Query all the tables and store them in a dictionary with table names as keys and dataframes as values
tables = {}
inspector = inspect(engine)
table_names = inspector.get_table_names()
print(f"Table names: {table_names}")
# Connect to the downloaded SQLite database file
conn = sqlite3.connect(local_file_name)

for table_name in table_names:
    tables[table_name] = pd.read_sql_query(f"SELECT * FROM {table_name}", conn)

# --- 2. Version that doesn't work ---
subscription_id = 'b0132afa-d29b-4d66-88f8-2c189dcba3be'
resource_group = 'resource-one'
workspace_name = 'workspace-one'
  
workspace = Workspace(subscription_id, resource_group, workspace_name)
# Get the datastore
datastore_name = 'workspaceblobstore'
datastore = Datastore.get(workspace, datastore_name)
# datastore = Datastore.get(workspace, "workspaceblobstore")

# Retrieve datastore account name and key
account_name = datastore.account_name
account_key = datastore.account_key

# Create a BlobServiceClient using the storage account key
connection_string = f"DefaultEndpointsProtocol=https;AccountName={account_name};AccountKey={account_key};EndpointSuffix=core.windows.net"
blob_service_client = BlobServiceClient.from_connection_string(connection_string)

# Download the SQLite database file to your local environment
local_file_name = "twodata.db"
print("downloading datastore")
datastore.download(target_path=os.getcwd(), prefix='UI/2023-04-30_175315_UTC/twodata.db', overwrite=True, show_progress=True)
print("download complete")
# Get a reference to the container
container_name = 'azureml-blobstore-94c291eb-0707-4d58-8ab8-33b5c7edb276'
# UI/2023-04-30_175315_UTC/
# UI/2023-04-30_152949_UTC/
container_client = blob_service_client.get_container_client(container_name)

# Path of the file in the container (note: the blob client below only lists blobs;
# the actual download above used datastore.download)
# file_path = 'UI/2023-04-30_152949_UTC/twodata.db'
# local_file_name = 'local_twodata.db'
file_path = 'twodata.db'
local_file_name = 'local_twodata.db'

# List blobs in the container
blobs = container_client.list_blobs()

for blob in blobs:
    print("Name:", blob.name)
    print("Size:", blob.size)

# Connect to the downloaded SQLite database file
engine = create_engine(f'sqlite:///{local_file_name}')
# Query all the tables and store them in a dictionary with table names as keys and dataframes as values
tables = {}
inspector = inspect(engine)
table_names = inspector.get_table_names()
print(f"Table names: {table_names}")
# Connect to the downloaded SQLite database file
conn = sqlite3.connect(local_file_name)

--- 3. Output and error from version 2 ---

downloading datastore
Downloading UI/2023-04-30_175315_UTC/twodata.db
Downloaded UI/2023-04-30_175315_UTC/twodata.db, 1 files out of an estimated total of 1
download complete
Name: UI/2023-04-27_225942_UTC/qqqq.db
Size: 1096105984
Name: UI/2023-04-30_152949_UTC/twodata.db
Size: 1113436160
Name: UI/2023-04-30_175315_UTC/twodata.db
Size: 1113436160
Traceback (most recent call last):
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1963, in _exec_single_context
    self.dialect.do_execute(
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 920, in do_execute
    cursor.execute(statement, parameters)
sqlite3.DatabaseError: database disk image is malformed

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "mltwo.py", line 68, in <module>
    table_names = inspector.get_table_names()
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 397, in get_table_names
    return self.dialect.get_table_names(
  File "<string>", line 2, in get_table_names
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/reflection.py", line 97, in cache
    ret = fn(self, con, *args, **kw)
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/dialects/sqlite/base.py", line 2117, in get_table_names
    names = connection.exec_driver_sql(query).scalars().all()
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1771, in exec_driver_sql
    ret = self._execute_context(
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1841, in _execute_context
    return self._exec_single_context(
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1982, in _exec_single_context
    self._handle_dbapi_exception(
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 2339, in _handle_dbapi_exception
    raise sqlalchemy_exception.with_traceback(exc_info[2]) from e
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/base.py", line 1963, in _exec_single_context
    self.dialect.do_execute(
  File "/anaconda/envs/azureml_py38/lib/python3.8/site-packages/sqlalchemy/engine/default.py", line 920, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.DatabaseError: (sqlite3.DatabaseError) database disk image is malformed
[SQL: SELECT name FROM sqlite_master WHERE type='table' AND name NOT LIKE 'sqlite~_%' ESCAPE '~' ORDER BY name]
(Background on this error at: https://sqlalche.me/e/20/4xp6)
1 answer

Sedat SALMAN 13,830 Reputation points
2023-05-01T07:54:20.35+00:00

"database disk image is malformed" means there is an issue with the database file itself.

Possible causes:

• A mismatch between the data types stored in the database and the data types expected by the code.
• A mismatch between the structure of the database and the SQL queries the code is attempting to execute.

You can try the following:

• Check whether the downloaded file is identical to the original file in the blob storage container (you can compare checksums).
• Verify the data types in the database against the data types expected by the code.
• Execute the SQL queries manually using a SQL client, such as SQLiteStudio, to see if there are any issues with the queries themselves.
• Create a new database file from scratch and upload it to the blob storage container to see if the issue persists.
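For the checksum comparison, here is a sketch. It assumes the blob had an MD5 recorded at upload time (content_settings.content_md5 can be None otherwise), and it reuses the container_client and blob path from the question, so the Azure part is left commented out:

```python
import hashlib

def file_md5(path, chunk_size=8 * 1024 * 1024):
    """MD5 of a local file, read in chunks so a ~1 GB database
    does not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()

# Compare with the MD5 the service stored for the blob, if any.
# (content_md5 is only populated when it was computed at upload
# time, so it can be None; names are the ones from the question.)
# blob_client = container_client.get_blob_client("UI/2023-04-30_175315_UTC/twodata.db")
# blob_md5 = blob_client.get_blob_properties().content_settings.content_md5
# if blob_md5 is not None:
#     print(file_md5("local_twodata.db") == bytes(blob_md5))
```

If the checksums differ, the file was altered in transit (for example a partial or interrupted transfer), which would explain why a viewer can still open parts of it while a full scan fails.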
