Golang app that seeds AWS DynamoDB with lyrics (categorized by song) for a single artist.
- Configure the AWS CLI on your local workstation.
- This app stores an artist's songs and lyrics in separate DynamoDB tables. Create one table for "songs" and one for "lyrics".
- Clone the repository
      git clone git@github.com:jseashell/lyrics-db-seeder.git
      cd lyrics-db-seeder
- Create an .env file and add the necessary values.
      cp .env.example .env
  - GENIUS_ACCESS_TOKEN: Visit https://docs.genius.com/. Sign up for a developer account, create a new API client, and "Generate Token" for that client (do not use the client ID/secret).
  - ARTIST: Name of the artist to collect.
  - INCLUDE_FEATURED: Indicates whether to scrape lyrics when the configured artist is listed as a featured artist. This can greatly increase the amount of data to be processed.
  - INCLUDE_ANDED: Indicates whether to scrape lyrics when the configured artist is listed "and another artist". This can greatly increase the amount of data to be processed.
  - AFFILIATIONS: List of affiliations to include in collections, or an empty string. Affiliations help the search engine, but searches will return both explicit and implicit affiliations. This can greatly increase the amount of data to be processed.
  - LOG_LEVEL: Log level. Supports "DEBUG", "INFO", "WARN", or "ERROR".
  - AWS_DYNAMODB_SONGS_TABLE_NAME: Name of the table in which to save songs.
  - SKIP_DB: Skips database operations. Typically used for debugging and verification before incurring AWS costs.
- Run the app
      # clean, build
      make
      # clean, build, run
      make go
.
├── cmd
│ └── main.go # entry point
├── docs # repo documentation
├── internal # internal packages
│ ├── db # dynamodb operations
│ ├── genius # genius.com integration
│ └── scraper # web scraper
├── .env.example # example environment file
├── .gitignore
├── go.mod # module dependencies
├── go.sum # dependency checksums
├── LICENSE
├── Makefile
└── README.md
Performance will vary depending on your DynamoDB read/write capacity settings and your network connection.
- aws-sdk-go-v2 - AWS SDK for the Go programming language.
- google/uuid - RFC-4122 compliant UUID module by Google.
- dotenv - A Go (golang) port of the Ruby dotenv project.
- colly - Lightning Fast and Elegant Scraping Framework for Gophers.
Repository contributors are not responsible for costs incurred by AWS services.
This software is distributed under the terms of the MIT License.