ArtBloom: A Hands-On Guide to Creating an Intelligent Art Discovery Backend

Project Description

ArtBloom is a backend application for art enthusiasts and researchers. It provides APIs for browsing, discovering, and analyzing artwork collections, pulling its data from the Art Institute of Chicago's open API and exposing asynchronous endpoints for metadata retrieval and data analysis.

Why Python for Backend Development?

Python is one of the most popular programming languages for backend development due to its simplicity, readability, and vast ecosystem of libraries and frameworks. Here’s why Python is an excellent choice for backend applications:

Ease of Use: Python's clean syntax and readability make it easy to learn and write.
Rich Ecosystem: Frameworks like Sanic, Flask, and Django simplify the development process.
Performance: With asynchronous frameworks and libraries such as Sanic and aiohttp, Python can serve many concurrent I/O-bound requests efficiently.
Integration: Python offers seamless integration with databases, APIs, and other services.
Community Support: A large, active community ensures quick access to tutorials, documentation, and support.

By using Python, you can focus on solving business problems rather than dealing with the complexities of low-level programming.

Learning Objectives

By following this guide, you will:

Understand the fundamentals of backend development with Python.
Learn how to set up and configure a Sanic server.
Use Tortoise ORM to interact with a PostgreSQL database.
Write and expose RESTful API endpoints.
Apply asynchronous programming to handle high-concurrency scenarios.
Debug, test, and optimize a Python backend application.
Explore data analysis techniques using Pandas for backend recommendations.

Key Features

API Endpoints

The application exposes the following endpoints using Sanic:

Fallback Endpoint: Handles non-existent paths and returns a 404 error.
```
GET /
Response: { "message": "This path does not exist." }
```

Retrieve All Artworks: Fetch a list of artworks based on query parameters.

```
GET /artworks
Query Parameters: filters such as medium, artist, year, etc.
Response: JSON containing artwork data.
```

Retrieve Artwork by ID: Get detailed information about a specific artwork.

```
GET /artworks/<artwork_id>
Response: JSON containing the artwork's metadata or a 404 error if not found.
```

Search Artworks: Search artworks based on specific criteria.

```
GET /artworks/search
Query Parameters: search keywords.
Response: JSON containing matching artworks.
```

Recommendation Route: Generate artwork recommendations using Pandas to analyze metadata.
```
GET /artworks/recommendations
Response: JSON containing recommended artworks.
```

These endpoints are implemented in the artworks_router module and rely on helper functions for data retrieval and processing.
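The helper functions behind these routes are not shown in this guide. As a rough, framework-free sketch of the logic they might contain, the functions below filter, look up, and search an in-memory list of artwork dicts. The function names, the field names (`artist`, `medium`, `year`), and the sample records are hypothetical; in the real application a Sanic route handler would pass in the request's query parameters and the records would come from the database.

```python
from typing import Any

Artwork = dict[str, Any]

# Hypothetical sample records; the field names are assumptions for illustration.
ARTWORKS: list[Artwork] = [
    {"id": 1, "title": "Starry Night", "artist": "Vincent van Gogh",
     "medium": "Oil on canvas", "year": 1889},
    {"id": 2, "title": "Wheatfield with Crows", "artist": "Vincent van Gogh",
     "medium": "Oil on canvas", "year": 1890},
    {"id": 3, "title": "Water Lilies", "artist": "Claude Monet",
     "medium": "Oil on canvas", "year": 1875},
]


def filter_artworks(artworks: list[Artwork], filters: dict[str, str]) -> list[Artwork]:
    """GET /artworks: keep artworks whose fields equal every supplied filter.

    Query parameters arrive as strings, so values are compared as
    case-insensitive strings.
    """
    def matches(artwork: Artwork) -> bool:
        return all(
            str(artwork.get(field, "")).lower() == str(value).lower()
            for field, value in filters.items()
        )

    return [artwork for artwork in artworks if matches(artwork)]


def get_artwork(artworks: list[Artwork], artwork_id: int) -> tuple[dict, int]:
    """GET /artworks/<artwork_id>: return a (body, HTTP status) pair."""
    for artwork in artworks:
        if artwork["id"] == artwork_id:
            return artwork, 200
    return {"message": f"Artwork {artwork_id} not found."}, 404


def search_artworks(artworks: list[Artwork], keyword: str) -> list[Artwork]:
    """GET /artworks/search: case-insensitive substring match on the title."""
    needle = keyword.lower()
    return [artwork for artwork in artworks if needle in artwork["title"].lower()]
```

Keeping this logic in plain functions, separate from the route handlers, makes it straightforward to unit test without starting the server.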

Technology Stack

Backend Framework: Sanic for asynchronous web APIs.
Database: PostgreSQL with Tortoise ORM.
Migration Tool: Aerich for database schema management.
API Integration: HTTPX for asynchronous HTTP requests to the Art Institute of Chicago API.
Environment Management: Python-dotenv for configuration.
Testing: unittest (PyUnit) for unit and integration testing.
Linting and Formatting: Pylint and isort for code quality.
Data Analysis: Pandas for advanced data manipulation.

Project Structure

```
art-bloom/
├── artworks_core/
│   ├── models/
│   │   ├── __init__.py
│   │   ├── artworks_data_helper.py
│   │   ├── artworks_data_reader.py
│   │   ├── artworks_data_writer.py
│   │   └── artworks_router.py
│   ├── artworks_settings/
│   │   ├── __init__.py
│   │   ├── generate_tortoise_config.py
│   │   ├── setup_env_configuration.py
│   │   └── tortoise_config_wrapper.py
│   ├── artworks_utils/
│   │   ├── __init__.py
│   │   ├── app_logger.py
│   │   ├── database_manager.py
│   │   ├── http_request_manager.py
│   │   └── logging.conf
│   └── migrations/
├── tests/
│   ├── __init__.py
│   └── test_format_artworks.py
├── .env
├── .gitignore
├── .pylintrc
├── app.log
├── LICENSE
├── Makefile
├── Pipfile
├── Pipfile.lock
├── pyproject.toml
├── README.md
└── server.py
```

Setup and Installation

Prerequisites

Python 3.12+
PostgreSQL
Pipenv

Steps

Clone the Repository:

```
git clone https://github.com/yourusername/artbloom.git art-bloom
cd art-bloom
```

Install Dependencies:
```
pipenv install
pipenv shell
```

Set Up Environment Variables: Create a .env file in the root directory with the following content:

```
APP_NAME=ArtBloom
DATABASE_URL=postgres://<username>:<password>@localhost:5432/artbloom
ARTWORKS_API=https://api.artic.edu/api/v1/artworks
SECRET_KEY=your_secret_key
DEBUG=True
```

Initialize the Database:

```
aerich init -t artworks_settings.tortoise_config_wrapper.TORTOISE_ORM
aerich init-db
```
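The `-t` option points Aerich at a Tortoise ORM configuration dictionary. The guide does not show that dictionary, so the following is a minimal sketch of the usual shape, under stated assumptions: the `build_tortoise_config` helper and the `artworks_core.models` module path are hypothetical, and the connection string uses placeholder credentials. Aerich needs its own `aerich.models` module listed alongside the application's models so it can track applied migrations.

```python
def build_tortoise_config(database_url: str, model_modules: list[str]) -> dict:
    """Return a Tortoise ORM config dict with a single default connection."""
    return {
        "connections": {"default": database_url},
        "apps": {
            "models": {
                # aerich.models holds Aerich's migration history table.
                "models": [*model_modules, "aerich.models"],
                "default_connection": "default",
            }
        },
    }


# Placeholder URL and an assumed models module path, for illustration only.
TORTOISE_ORM = build_tortoise_config(
    "postgres://user:password@localhost:5432/artbloom",
    ["artworks_core.models"],
)
print(TORTOISE_ORM["apps"]["models"]["models"])
```

In practice the URL would come from the `DATABASE_URL` environment variable rather than being written inline.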

Run the Application:
```
make run
```

Usage

Access the API at http://localhost:4000.
Use the provided endpoints to interact with artwork data.
Log outputs are stored in app.log for debugging and monitoring.
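As a quick way to try the API from Python, the sketch below builds a `GET /artworks` URL from keyword filters and fetches it with the standard library. The helper names are hypothetical, and the filter names (`medium`, `year`) follow the examples listed for the endpoint; the exact parameters the server accepts may differ. `fetch_artworks` is defined but not called here because it needs the server to be running.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:4000"


def artworks_url(**filters) -> str:
    """Build a GET /artworks URL, dropping filters whose value is None."""
    query = urlencode({name: value for name, value in filters.items() if value is not None})
    return f"{BASE_URL}/artworks" + (f"?{query}" if query else "")


def fetch_artworks(**filters):
    """Call the running server and decode its JSON response."""
    with urlopen(artworks_url(**filters), timeout=10) as response:
        return json.load(response)


print(artworks_url(medium="Oil on canvas", year=1889))
# http://localhost:4000/artworks?medium=Oil+on+canvas&year=1889
```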

Testing and Linting

Run Tests:
```
make test
```
Lint the Code:
```
make lint
```
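For readers new to unittest, here is a small sketch of what a test in tests/test_format_artworks.py could look like. The `format_artwork` function is a hypothetical stand-in written for this example, not the project's actual formatter; it keeps a few fields and derives `avg_date` as the midpoint of the start and end dates.

```python
import unittest


def format_artwork(raw: dict) -> dict:
    """Hypothetical formatter: keep selected fields and derive avg_date."""
    return {
        "id": raw["id"],
        "title": raw.get("title") or "Untitled",
        "style_title": raw.get("style_title"),
        "medium_display": raw.get("medium_display"),
        "avg_date": (raw["date_start"] + raw["date_end"]) / 2,
    }


class FormatArtworkTests(unittest.TestCase):
    def test_avg_date_is_midpoint_of_start_and_end(self):
        raw = {"id": 3, "title": "Water Lilies", "date_start": 1875, "date_end": 1876}
        self.assertEqual(format_artwork(raw)["avg_date"], 1875.5)

    def test_missing_title_falls_back_to_untitled(self):
        raw = {"id": 7, "title": None, "date_start": 1900, "date_end": 1900}
        self.assertEqual(format_artwork(raw)["title"], "Untitled")
```

Saved under tests/, a file like this can also be run directly with `python -m unittest`.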

Recommendation System

Overview

The recommendation system in ArtBloom dynamically generates artwork suggestions based on user preferences and inferred interests. It combines metadata filtering, weighted scoring, and ranking by temporal proximity to provide meaningful recommendations.

How It Works

User Preferences:
- The system starts with explicitly provided preferences, such as:
  - style_title: Artistic style (e.g., "Post-Impressionism").
  - medium_display: Medium used (e.g., "Oil on canvas").
  - category_titles: Categories (e.g., "Painting and Sculpture of Europe").

Filter Artworks:

Artworks are filtered based on the user’s preferences, and a scoring mechanism is applied to prioritize matches.

```
# Weighted match score: style (3) + medium (2) + any shared category (1).
# An artwork that matches all three preferences scores 6.
filtered_artworks["score"] = (
    (filtered_artworks["style_title"] == user_preferences["style_title"]).astype(int) * 3 +
    (filtered_artworks["medium_display"] == user_preferences["medium_display"]).astype(int) * 2 +
    (filtered_artworks["category_titles"].apply(lambda x: any(cat in x for cat in user_preferences["category_titles"]))).astype(int)
)
```

Temporal Inference:
- Infer the user’s preferred time period by averaging avg_date, the midpoint of each artwork's start and end dates, across the filtered artworks.
```
user_date_preference = filtered_artworks["avg_date"].mean()
```

Rank Artworks by Score and Temporal Similarity:

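Rank artworks by score (highest first) and then by distance from the inferred time period (smallest first). Despite its name, date_similarity is a distance in years, so smaller values indicate a closer match.

```
# Series.abs() avoids a separate NumPy import.
filtered_artworks["date_similarity"] = (filtered_artworks["avg_date"] - user_date_preference).abs()
filtered_artworks = filtered_artworks.sort_values(
    by=["score", "date_similarity"], ascending=[False, True]
).head(5)
```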

Select Top Recommendations:
- The top artworks are selected and formatted for the response.

Example Workflow

Artworks Data

ID	Title	Style Title	Medium Display	Date Start	Date End	Avg Date	Score
1	Starry Night	Post-Impressionism	Oil on canvas	1889	1889	1889.0	6
2	Wheatfield with Crows	Post-Impressionism	Oil on canvas	1890	1890	1890.0	6
3	Water Lilies	Impressionism	Oil on canvas	1875	1876	1875.5	3
4	Red Eiffel Tower	Modernism	Oil on canvas	1911	1911	1911.0	2

User Preferences

{
  "style_title": "Post-Impressionism",
  "medium_display": "Oil on canvas",
  "category_titles": ["Painting and Sculpture of Europe"]
}

Steps and Results

Filter Artworks: Match artworks based on style, medium, and category preferences, and calculate scores.
Infer Temporal Preference: Calculate the user’s preferred date as the mean avg_date of the artworks that match every preference (IDs 1 and 2, each with a score of 6).
```
user_date_preference = mean([1889.0, 1890.0]) = 1889.5
```
Rank by Temporal Similarity: Compute temporal similarity and rank by score and similarity.

ID	Avg Date	Score	Date Similarity
1	1889.0	6	0.5
2	1890.0	6	0.5
3	1875.5	3	14.0
4	1911.0	2	21.5

Select Top Recommendations: Return the top-ranked artworks.
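
The numbers in this walkthrough can be checked with a short plain-Python script that mirrors the pandas steps without requiring pandas. Two points are assumptions made to reproduce the tables: the category lists are invented, since the Artworks Data table has no category column, and the preferred date is averaged over the top-scoring artworks only, which is what the 1889.5 figure implies.

```python
from statistics import mean

user_preferences = {
    "style_title": "Post-Impressionism",
    "medium_display": "Oil on canvas",
    "category_titles": ["Painting and Sculpture of Europe"],
}

# Rows from the Artworks Data table. Category lists are assumed values
# chosen so that the computed scores match the table.
europe = ["Painting and Sculpture of Europe"]
artworks = [
    {"id": 1, "style_title": "Post-Impressionism", "medium_display": "Oil on canvas",
     "category_titles": europe, "date_start": 1889, "date_end": 1889},
    {"id": 2, "style_title": "Post-Impressionism", "medium_display": "Oil on canvas",
     "category_titles": europe, "date_start": 1890, "date_end": 1890},
    {"id": 3, "style_title": "Impressionism", "medium_display": "Oil on canvas",
     "category_titles": europe, "date_start": 1875, "date_end": 1876},
    {"id": 4, "style_title": "Modernism", "medium_display": "Oil on canvas",
     "category_titles": [], "date_start": 1911, "date_end": 1911},
]


def score(artwork: dict, prefs: dict) -> int:
    """Style match is worth 3, medium 2, any shared category 1."""
    return (
        3 * (artwork["style_title"] == prefs["style_title"])
        + 2 * (artwork["medium_display"] == prefs["medium_display"])
        + 1 * any(cat in artwork["category_titles"] for cat in prefs["category_titles"])
    )


for artwork in artworks:
    artwork["avg_date"] = (artwork["date_start"] + artwork["date_end"]) / 2
    artwork["score"] = score(artwork, user_preferences)

# Assumption: infer the preferred date from the best-scoring artworks only.
top_score = max(artwork["score"] for artwork in artworks)
user_date_preference = mean(
    artwork["avg_date"] for artwork in artworks if artwork["score"] == top_score
)

for artwork in artworks:
    artwork["date_similarity"] = abs(artwork["avg_date"] - user_date_preference)

# Highest score first, then smallest distance from the preferred date.
ranked = sorted(artworks, key=lambda a: (-a["score"], a["date_similarity"]))
for artwork in ranked:
    print(artwork["id"], artwork["avg_date"], artwork["score"], artwork["date_similarity"])
# 1 1889.0 6 0.5
# 2 1890.0 6 0.5
# 3 1875.5 3 14.0
# 4 1911.0 2 21.5
```

If the mean were taken over all four artworks instead, the preferred date would be 1891.375 and the distances would differ from the table.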

Key Features

Prioritized Matching: Ensures that artworks with the highest alignment to preferences are ranked first.
Temporal Proximity: Breaks ties between equally scored artworks by favoring those closest to the inferred time period.
Scalable Logic: Easily extendable to incorporate additional metadata or preferences.

Deployment

Local Deployment

Run the server using the Makefile target make run.
Ensure PostgreSQL is running locally with the configured database.

Free Hosting Platforms

Render: Free tier for hosting backend apps.
Railway: Offers $5/month free credits.
Fly.io: Docker-based deployment with global regions.

Additional Resources

Sanic Documentation
Tortoise ORM Documentation
PostgreSQL Official Site
Python-dotenv Documentation
Pandas Documentation
