Lyrics Database Seeder

Golang app that seeds AWS DynamoDB with lyrics (categorized by song) for a single artist.

Running the App

  1. Install Go.

  2. Configure the AWS CLI on your local workstation.

  3. This app stores an artist's songs and lyrics in separate DynamoDB tables. Create one table for "songs" and one for "lyrics".

  4. Clone the repository.

    git clone git@github.com:jseashell/lyrics-db-seeder.git
    cd lyrics-db-seeder

  5. Create a .env file and add the necessary values.

    cp .env.example .env
    • GENIUS_ACCESS_TOKEN: Visit https://docs.genius.com/. Sign up for a developer account, create a new API client, and "Generate Token" for that client (do not use the client ID/secret).
    • ARTIST: Name of the artist to collect.
    • INCLUDE_FEATURED: Whether to scrape lyrics for songs on which ARTIST is listed as a featured artist. This can greatly increase the amount of data to be processed.
    • INCLUDE_ANDED: Whether to scrape lyrics for songs on which ARTIST is listed "and another artist". This can greatly increase the amount of data to be processed.
    • AFFILIATIONS: List of affiliations to include in the collection; may be an empty string. Affiliations help narrow the search, but results can still include both explicit and implicit affiliations. This can greatly increase the amount of data to be processed.
    • LOG_LEVEL: Log level. Supports "DEBUG", "INFO", "WARN", or "ERROR".
    • AWS_DYNAMODB_SONGS_TABLE_NAME: Name of the table in which to save songs.
    • SKIP_DB: Skips database operations. Typically used for debugging and verification before incurring AWS costs.

  6. Run the app.

    # clean, build
    make
    
    # clean, build, run
    make go
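
As a rough illustration of how the variables from step 5 might be read at startup, here is a minimal sketch that uses only the standard library. It is not the repository's actual code. The `Config` type, `loadConfig`, and `parseBool` are hypothetical names. The comma-separated format for AFFILIATIONS, the "INFO" default for LOG_LEVEL, and the rule that the songs table name is only required when SKIP_DB is off are all assumptions. It also assumes the values from .env have already been loaded into the process environment.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// Config mirrors the variables documented for the .env file (hypothetical type).
type Config struct {
	GeniusAccessToken string
	Artist            string
	IncludeFeatured   bool
	IncludeAnded      bool
	Affiliations      []string
	LogLevel          string
	SongsTable        string
	SkipDB            bool
}

// parseBool treats an unset (empty) value as false and otherwise defers to
// strconv.ParseBool, which accepts values such as "true", "false", "1", "0".
func parseBool(key, raw string) (bool, error) {
	if raw == "" {
		return false, nil
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("%s: %w", key, err)
	}
	return v, nil
}

// loadConfig reads settings through getenv (pass os.Getenv in a real run).
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		GeniusAccessToken: getenv("GENIUS_ACCESS_TOKEN"),
		Artist:            getenv("ARTIST"),
		LogLevel:          strings.ToUpper(getenv("LOG_LEVEL")),
		SongsTable:        getenv("AWS_DYNAMODB_SONGS_TABLE_NAME"),
	}
	if cfg.GeniusAccessToken == "" || cfg.Artist == "" {
		return cfg, fmt.Errorf("GENIUS_ACCESS_TOKEN and ARTIST are required")
	}
	if cfg.LogLevel == "" {
		cfg.LogLevel = "INFO" // assumed default, not stated in the README
	}
	switch cfg.LogLevel {
	case "DEBUG", "INFO", "WARN", "ERROR":
	default:
		return cfg, fmt.Errorf("unsupported LOG_LEVEL %q", cfg.LogLevel)
	}
	var err error
	if cfg.IncludeFeatured, err = parseBool("INCLUDE_FEATURED", getenv("INCLUDE_FEATURED")); err != nil {
		return cfg, err
	}
	if cfg.IncludeAnded, err = parseBool("INCLUDE_ANDED", getenv("INCLUDE_ANDED")); err != nil {
		return cfg, err
	}
	if cfg.SkipDB, err = parseBool("SKIP_DB", getenv("SKIP_DB")); err != nil {
		return cfg, err
	}
	// Assumption: AFFILIATIONS is a comma-separated list; blanks are dropped.
	for _, a := range strings.Split(getenv("AFFILIATIONS"), ",") {
		if a = strings.TrimSpace(a); a != "" {
			cfg.Affiliations = append(cfg.Affiliations, a)
		}
	}
	// Assumption: the table name only matters when database writes are enabled.
	if !cfg.SkipDB && cfg.SongsTable == "" {
		return cfg, fmt.Errorf("AWS_DYNAMODB_SONGS_TABLE_NAME is required unless SKIP_DB is set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Printf("artist=%q skipDB=%t affiliations=%v\n", cfg.Artist, cfg.SkipDB, cfg.Affiliations)
}
```

Passing the lookup function as a parameter lets the parsing be exercised without touching the real environment, which fits the dry-run use of SKIP_DB.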

Project Structure

.
├── cmd
│   └── main.go             # entry point
├── docs                    # repo documentation
├── internal                # internal packages
│   ├── db                  # dynamodb operations
│   ├── genius              # genius.com integration
│   └── scraper             # web scraper
├── .env.example            # example environment file
├── .gitignore
├── go.mod                  # module dependencies
├── go.sum                  # dependency checksums
├── LICENSE
├── Makefile
└── README.md

AWS

Performance will vary depending on your DynamoDB read/write capacity settings and your network connection.
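
To get a feel for how write capacity bounds a seeding run, the sketch below estimates a lower bound on write time for a table in provisioned mode. It relies on the DynamoDB rule that one write capacity unit covers one standard write per second for an item up to 1 KB. The function names and the example numbers are hypothetical, and the estimate ignores network latency, throttling and retries, and time spent fetching lyrics. On-demand tables are not bounded this way.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// writeUnitsPerItem returns the write capacity units one standard
// (non-transactional) write consumes: one unit per 1 KB, rounded up.
func writeUnitsPerItem(itemBytes int) int {
	if itemBytes <= 0 {
		return 1
	}
	return (itemBytes + 1023) / 1024
}

// estimateSeedTime returns a lower bound on the time needed to write
// `items` items of roughly `itemBytes` each at `provisionedWCU` units per second.
func estimateSeedTime(items, itemBytes int, provisionedWCU float64) time.Duration {
	if items <= 0 || provisionedWCU <= 0 {
		return 0
	}
	totalUnits := float64(items * writeUnitsPerItem(itemBytes))
	seconds := math.Ceil(totalUnits / provisionedWCU)
	return time.Duration(seconds) * time.Second
}

func main() {
	// Hypothetical workload: 5,000 lyric items of about 2.5 KB each on a 25 WCU table.
	fmt.Println(estimateSeedTime(5000, 2560, 25))
}
```

Doubling the provisioned write capacity halves this bound, which is one reason the same artist can take noticeably different times to seed on differently configured tables.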

Third-Party Libraries

  • aws-sdk-go-v2 - AWS SDK for the Go programming language.
  • google/uuid - RFC-4122 compliant UUID module by Google.
  • dotenv - A Go (golang) port of the Ruby dotenv project.
  • colly - Lightning Fast and Elegant Scraping Framework for Gophers.

Disclaimer

Repository contributors are not responsible for costs incurred by AWS services.

License

This software is distributed under the terms of the MIT License.
