-
Collaboration-For-Beginners Public
Forked from udit-001/Collaboration-For-Beginners. A Beginner's Guide to Contributing in an Open Source Project.
MIT License Updated Mar 11, 2024 -
minbpe Public
Forked from karpathy/minbpe. Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
-
SeaGOAT Public
Forked from kantord/SeaGOAT. Local-first semantic code search engine
-
Awesome-LLM Public
Forked from Hannibal046/Awesome-LLM. Awesome-LLM: a curated list of Large Language Model
Creative Commons Zero v1.0 Universal Updated Mar 7, 2024 -
LWM Public
Forked from LargeWorldModel/LWM. Large World Model (LWM) is a general-purpose large-context multimodal autoregressive model.
-
ServerlessWebCrawler Public
Forked from beabetterdevv/ServerlessWebCrawler. A Serverless Web Crawler built on AWS
Python Updated Jan 9, 2024 -
e2e-data-engineering Public
Forked from airscholar/e2e-data-engineering. An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All comp…
Python Updated Oct 5, 2023 -
crypto_api_kafka_airflow_streaming Public
Forked from dogukannulu/crypto_api_kafka_airflow_streaming. Get Crypto data from API, stream it to Kafka with Airflow. Write data to MySQL and visualize with Metabase
Python Updated Oct 2, 2023 -
airflow_kafka_cassandra_mongodb Public
Forked from dogukannulu/airflow_kafka_cassandra_mongodb. Produce Kafka messages, consume them and upload into Cassandra, MongoDB.
Python Updated Sep 26, 2023 -
aws_end_to_end_streaming_pipeline Public
Forked from dogukannulu/aws_end_to_end_streaming_pipeline. An AWS Data Engineering End-to-End Project (Glue, Lambda, Kinesis, Redshift, QuickSight, Athena, EC2, S3)
Python Updated Sep 20, 2023 -
send_data_to_aws_services Public
Forked from dogukannulu/send_data_to_aws_services. This repo automates the processes when we want to send remote data to AWS services such as Kinesis, S3, etc.
Python Updated Aug 27, 2023 -
glue_etl_job_data_catalog_s3 Public
Forked from dogukannulu/glue_etl_job_data_catalog_s3. Glue ETL job or EMR Spark that gets from data catalog, modifies and uploads to S3 and Data Catalog
Jupyter Notebook Updated Aug 26, 2023 -
s3_trigger_lambda_to_rds Public
Forked from dogukannulu/s3_trigger_lambda_to_rds. Send a dataframe to S3 automatically, trigger Lambda and modify dataframe, upload to RDS
Python Updated Aug 5, 2023 -
streaming_data_processing Public
Forked from dogukannulu/streaming_data_processing. Create a streaming data, transfer it to Kafka, modify it with PySpark, take it to ElasticSearch and MinIO
Python Updated Jul 21, 2023 -
open_llama Public
Forked from openlm-research/open_llama. OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
Apache License 2.0 Updated Jul 16, 2023 -
32-Verilog-Mini-Projects Public
Forked from sudhamshu091/32-Verilog-Mini-Projects. Implementing 32 Verilog Mini Projects.
-