avgra3/README.md

Antony G.

I am a naturally curious person who enjoys working with data and making useful tools. Throughout my career working with data, I have built multiple tools for myself and my team to drive productivity and reduce friction in our work.

My GitHub is a short list of things I have created for myself and for work.

Project Highlights

Audio Converter

Since early 2024, I have been listening to audiobooks. However, the audiobooks I purchased from Audible use a proprietary file format that I had trouble playing on my laptop. I looked for tools to convert the files to an easier-to-play format such as mp4, but I was also curious about how to do the conversion myself, which led me to FFmpeg. FFmpeg is great, but its CLI is a bit confusing, and I was going to repeat the same steps for multiple files. Thus AudioConverter was born.

This project is still in development, but it is functional: I use it personally and keep making tweaks. It taught me how to use the file explorer to grab the path of a file and how to build a nice TUI. All of the logic is written in C#.
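To give a rough idea of the batch conversion step, here is a minimal sketch in Python rather than the project's C#. The helper name `build_ffmpeg_command`, the `.aax` file names, and the choice of a plain stream copy are assumptions for illustration; any decryption or format-specific options the real tool passes to FFmpeg are not shown.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: Path, out_dir: Path, extension: str = "mp4") -> list[str]:
    """Build an ffmpeg argument list that remuxes `source` into `out_dir`.

    Hypothetical helper: the real AudioConverter may use different options.
    """
    target = out_dir / f"{source.stem}.{extension}"
    return [
        "ffmpeg",
        "-n",            # never overwrite an existing output file
        "-i", str(source),
        "-c", "copy",    # copy the streams instead of re-encoding them
        str(target),
    ]


def convert_all(sources: list[Path], out_dir: Path) -> None:
    """Run ffmpeg once per input file (requires ffmpeg on the PATH)."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for source in sources:
        subprocess.run(build_ffmpeg_command(source, out_dir), check=True)


if __name__ == "__main__":
    # Only print the commands here so the sketch runs without ffmpeg installed.
    for book in [Path("book_one.aax"), Path("book_two.aax")]:
        print(" ".join(build_ffmpeg_command(book, Path("converted"))))
```

Building the argument list separately from running it keeps the repeated part (one command per file) easy to inspect before anything is converted.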

Database Changes

I have made a tool for efficiently changing a database's parameters. It has been adapted for different use cases, including:

  • Convert a database's engine to MyISAM from Aria
  • Convert a database's collation from latin1 to utf8mb4
  • Convert all char fields to varchar

This tool uses Python multiprocessing. It first gets the list of tables that need to be changed and breaks that list into roughly equal sublists, with the number of sublists determined by the number of CPU cores available on the machine. It then generates the SQL required to change the tables and executes those SQL scripts in parallel. This tool has made database-wide changes easier for both me and my colleagues. Check out the open version here.
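The chunk-and-parallelize approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the tool's actual code: the function names, the table names, and the `utf8mb4_general_ci` collation are hypothetical, and the worker only generates SQL where a real worker would open its own database connection and execute it.

```python
import os
from multiprocessing import Pool


def split_into_chunks(items: list, n_chunks: int) -> list[list]:
    """Split `items` into at most `n_chunks` sublists of roughly equal size."""
    n_chunks = max(1, min(n_chunks, len(items) or 1))
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def build_alter_statements(tables: list[str]) -> list[str]:
    """Generate one collation-change statement per table (assumed collation)."""
    return [
        f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8mb4 "
        "COLLATE utf8mb4_general_ci;"
        for table in tables
    ]


if __name__ == "__main__":
    tables = [f"table_{n}" for n in range(10)]  # stand-in for a real table list
    chunks = split_into_chunks(tables, os.cpu_count() or 1)
    with Pool(processes=len(chunks)) as pool:
        # Each worker handles one sublist; here it only builds the SQL.
        for statements in pool.map(build_alter_statements, chunks):
            print("\n".join(statements))
```

Giving each worker its own sublist means no two processes ever alter the same table, so the workers do not need to coordinate with each other.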

Automation of ETL

This project uses Python to automate the ETL processing of new data. It focuses on automating the transformations that prepare the data for our production databases, replacing steps that would otherwise be run manually.

This was originally done in Alteryx, but I moved it to Python and SQL templating, which made it more modular and easier to use. I also added a TOML configuration file, which is the only item that needs updating if database parameters (username, password, hostname, etc.) change.

As a final update to the project, I used the Python multiprocessing module to run the three different portions of the process against the database in parallel, greatly reducing the time it took to process this dataset. To make the tool easier for anyone to use, I added logging and a simple CLI with a help command.
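A minimal sketch of that final shape, assuming three independent portions, could look like the following. The portion names, the `--workers` and `--verbose` flags, and the placeholder `run_portion` body are all hypothetical; `argparse` supplies the `--help` output automatically.

```python
import argparse
import logging
from multiprocessing import Pool

# Hypothetical names for the three independent portions of the job.
PORTIONS = ["portion_a", "portion_b", "portion_c"]


def run_portion(name: str) -> str:
    """Placeholder worker; a real one would run that portion's SQL."""
    return f"{name}: done"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run the ETL portions in parallel.")
    parser.add_argument("--workers", type=int, default=len(PORTIONS),
                        help="number of worker processes (default: %(default)s)")
    parser.add_argument("--verbose", action="store_true",
                        help="log at DEBUG level instead of INFO")
    return parser


def main(argv=None) -> list[str]:
    args = build_parser().parse_args(argv)
    logging.basicConfig(
        level=logging.DEBUG if args.verbose else logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    logging.info("Starting %d portions with %d workers", len(PORTIONS), args.workers)
    with Pool(processes=args.workers) as pool:
        results = pool.map(run_portion, PORTIONS)
    for result in results:
        logging.info(result)
    return results


if __name__ == "__main__":
    main()
```

Running the script with `--help` prints the usage text and the description of each flag.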

Pinned

  1. MariaDB-Context-Manager Public

    A context manager to use with Python to easily connect and run queries

    Python

  2. cswc Public

    A C# implementation of the wc command line tool on Linux

    C#

  3. fwparser Public

    A simple tool to parse fixed width files using Python

    Python

  4. AudioConverter Public

    A simple audio conversion tool written in C#

    C#