briansu2004/MyAI

My AI

My AI Skills

LLM, Prompting, Markdown Prompts Framework; Machine Learning Algorithms, linear regression, logistic regression, decision trees, random forests, support vector machines, k-nearest neighbors, naive Bayes; Data Preprocessing, Model Evaluation and Validation, Feature Selection and Engineering, Model Deployment, TensorFlow; Data Science, Python (NumPy, Pandas, Scikit-learn, TensorFlow, Keras), R, Jupyter Notebooks; Data visualization libraries, Matplotlib, Seaborn, and Plotly; NLP; Computer Vision, PyTorch; IBM Watson, Azure AI platform, AWS AI platform, GCP AI platform; LangChain

  1. Machine Learning Algorithms: linear regression, logistic regression, decision trees, random forests, support vector machines, k-nearest neighbors, naive Bayes, gradient boosting, and neural networks (including deep learning architectures such as CNNs, RNNs, and LSTMs).

  2. Data Preprocessing: feature scaling, feature engineering, dimensionality reduction (PCA, LDA), and handling missing data.

  3. Model Evaluation and Validation: evaluating and validating machine learning models using techniques like cross-validation, hyperparameter tuning, and metrics such as accuracy, precision, recall, F1-score, ROC-AUC, and mean squared error (MSE).

  4. Feature Selection and Engineering: selecting relevant features, creating new features, and performing feature engineering tasks like one-hot encoding, label encoding, and handling categorical variables.

  5. Model Deployment: deploying machine learning models into production environments using frameworks like TensorFlow Serving, Flask, Django, or cloud-based deployment platforms like AWS SageMaker or Google Cloud AI Platform.

  6. Data Science Tools: Python (NumPy, Pandas, Scikit-learn, TensorFlow, Keras), R, Jupyter Notebooks, and data visualization libraries like Matplotlib, Seaborn, and Plotly.

  7. Data Analysis: performing exploratory data analysis (EDA) and drawing insights from data using statistical methods, visualization techniques, and domain knowledge.

  8. Natural Language Processing (NLP): text classification, sentiment analysis, named entity recognition, topic modeling, and building chatbots or virtual assistants.

  9. Computer Vision: image classification, object detection, image segmentation, and building models with frameworks like OpenCV, TensorFlow, or PyTorch.

  10. Reinforcement Learning: Q-learning, deep Q-networks (DQN), and policy gradients.
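
As a small illustration of the evaluation metrics named in item 3, the sketch below computes accuracy, precision, recall, and F1-score for binary labels in plain Python. The function name and the sample labels are made up for illustration; in practice a library such as Scikit-learn offers equivalent metrics.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```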

My AI Projects

Project 3

Developed a sophisticated knowledge-based ontology graph database application tailored for a federal client's needs.

  • Seamlessly transitioned from a traditional relational data model to a dynamic Graph data model, leveraging the robust capabilities of Neo4j for enhanced data representation and querying efficiency.
  • Implemented a comprehensive Natural Language Processing (NLP) pipeline utilizing cutting-edge Python libraries and Azure Cognitive Services APIs. This pipeline adeptly analyzed extensive datasets, extracting pertinent entities, identifying key phrases, categorizing content, and performing sentiment analysis.
  • Leveraged advanced data science algorithms within Jupyter Notebook to derive meaningful insights from the analyzed data, empowering decision-makers with actionable intelligence.
  • Utilized Neo4j Bloom to visualize intricate ontology knowledge graphs, facilitating intuitive exploration and understanding of complex relationships within the data.
  • Expanded the scope of data acquisition by integrating additional feeds from diverse social media platforms. Leveraged the updated data to re-run the NLP pipeline, ensuring that insights remained current and reflective of evolving trends and sentiments across various digital channels.

Project 2

Developed a Python-based AI project for a government client aimed at predicting the likelihood of payment for issued parking tickets.

  • Gathered diverse datasets from multiple sources to ensure comprehensive coverage and accuracy in training the predictive model.
  • Employed robust data preprocessing techniques to filter out outliers and ensure the integrity of the dataset, enhancing the reliability of subsequent analyses.
  • Split the dataset into distinct training and testing sets so that model performance could be evaluated on unseen data, making overfitting detectable and giving a realistic estimate of generalizability.
  • Leveraged a variety of machine learning algorithms to train predictive models, rigorously evaluating their performance against established metrics.
  • Utilized Area Under the ROC Curve (AUC-ROC) as a primary evaluation metric to identify the most effective model in predicting the likelihood of payment for parking tickets.
  • Selected the best-performing model based on AUC-ROC scores, ensuring optimal predictive accuracy and reliability for informing decision-making processes within the government agency.
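
The model selection step can be sketched as follows. This is a minimal illustration, not the project's actual code: it computes AUC-ROC as the fraction of (paid, unpaid) ticket pairs in which the paid ticket gets the higher score, then picks the candidate with the highest value. The model names, labels, and scores are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC-ROC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical held-out labels (1 = ticket was paid) and predicted probabilities.
y_test = [1, 0, 1, 1, 0, 0]
candidate_scores = {
    "model_a": [0.9, 0.2, 0.8, 0.6, 0.4, 0.1],
    "model_b": [0.6, 0.7, 0.8, 0.3, 0.4, 0.2],
}

auc_by_model = {name: roc_auc(y_test, s) for name, s in candidate_scores.items()}
best_model = max(auc_by_model, key=auc_by_model.get)
print(auc_by_model, best_model)
```

The pairwise loop is quadratic in the number of examples, so it only suits small test sets; a library routine such as Scikit-learn's `roc_auc_score` is the usual choice for real data.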

Project 1

Developed a Python data science project tailored for a prominent supermarket chain client to optimize their store location strategy and enhance customer experience.

  • Initiated the project by gathering comprehensive neighborhood data from Wikipedia, leveraging web scraping techniques to extract relevant information such as demographics, population density, and socioeconomic indicators.

  • Augmented the neighborhood dataset by incorporating geographical coordinates using the Geocoder library, facilitating precise mapping and analysis of locations.

  • Employed the powerful capabilities of the Foursquare Places API to retrieve detailed location and venue data, including nearby amenities, competitors, and popular attractions within each neighborhood.

  • Utilized advanced data preprocessing techniques to clean and prepare the dataset for analysis, including handling missing values, normalizing features, and encoding categorical variables.

  • Applied the k-means clustering algorithm to segment neighborhoods based on similarities in venue characteristics, enabling the identification of distinct clusters representing different customer preferences and market segments.

  • Conducted thorough exploratory data analysis (EDA) to gain insights into the distribution of clusters and understand the underlying patterns driving customer behavior and preferences.

  • Leveraged visualization tools such as Matplotlib and Seaborn to create informative visualizations, including scatter plots, heatmaps, and cluster maps, to effectively communicate findings and insights to stakeholders.

  • Provided actionable recommendations to the client based on the clustering results, including insights into optimal store locations, potential expansion opportunities, and strategies for improving customer engagement and retention.
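
The clustering step can be illustrated with a bare-bones k-means in plain Python. This is a sketch under assumptions, not the project's code: the neighborhood names and the three venue-share features (for example grocery, restaurant, and park shares) are hypothetical, and a real analysis would more likely use Scikit-learn's `KMeans`.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20, seed=0):
    """Lloyd's algorithm: alternate nearest-centroid assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return assignments, centroids


# Hypothetical venue-category shares per neighborhood.
venue_profiles = {
    "Riverside": (0.8, 0.1, 0.1),
    "Old Town": (0.7, 0.2, 0.1),
    "Harbour": (0.1, 0.8, 0.1),
    "Midtown": (0.2, 0.7, 0.1),
}
names = list(venue_profiles)
labels, _ = kmeans([venue_profiles[n] for n in names], k=2)
cluster_of = dict(zip(names, labels))
print(cluster_of)
```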

My AI Platforms

Amazon Web Services (AWS)

  1. Amazon SageMaker: A fully managed machine learning service that enables developers to build, train, and deploy machine learning models at scale.
  2. Amazon Lex: A service for building conversational interfaces into any application using voice and text.
  3. Amazon Polly: A text-to-speech service that uses advanced deep learning technologies to synthesize speech that sounds like a human voice.
  4. Amazon Rekognition: A service for adding image and video analysis to applications, including object and scene detection, facial analysis, and celebrity recognition.
  5. Amazon Comprehend: A natural language processing (NLP) service for extracting insights and relationships from unstructured text.
  6. Amazon Transcribe: A service for converting speech to text, enabling developers to transcribe audio files or streams into text in real time.
  7. AWS DeepLens: A deep learning-enabled video camera that allows developers to experiment with and deploy deep learning models on the edge.
  8. AWS DeepRacer: A fully autonomous 1/18th scale race car driven by reinforcement learning models, designed to help developers learn about reinforcement learning and machine learning in a fun way.
  9. AWS Deep Learning AMIs: Preconfigured Amazon Machine Images (AMIs) that come with popular deep learning frameworks and libraries installed, allowing developers to quickly set up deep learning environments on EC2 instances.

Google Cloud Platform (GCP)

  1. Google Cloud AI Platform: A suite of machine learning services that enables developers to build, train, and deploy machine learning models at scale.
  2. Google Cloud AutoML: A suite of machine learning products that enables developers with limited machine learning expertise to train high-quality custom models for specific use cases, such as image recognition and natural language processing.
  3. Google Cloud Vision API: A service for integrating image recognition and analysis capabilities into applications.
  4. Google Cloud Speech-to-Text: A service for converting speech into text in over 120 languages and variants.
  5. Google Cloud Text-to-Speech: A service for converting text into natural-sounding speech in over 30 languages and variants.
  6. Google Cloud Translation API: A service for translating text between languages using pre-trained models or custom machine learning models.
  7. Google Cloud Natural Language API: A service for analyzing and extracting insights from unstructured text using machine learning models.
  8. Google Cloud Video Intelligence API: A service for annotating and analyzing videos using machine learning models to detect objects, faces, and activities.
  9. Google Cloud Speech Command Recognition: A service for recognizing speech commands in audio data using machine learning models, designed for IoT and other edge device applications.
  10. Google Cloud AI Hub: A hosted repository and collaboration platform for discovering, sharing, and deploying machine learning models and datasets.

Microsoft Azure

  1. Azure Machine Learning: A cloud-based service for building, training, and deploying machine learning models at scale, with support for various programming languages and frameworks.
  2. Azure Cognitive Services: A collection of AI-powered APIs and pre-built models for vision, speech, language, and decision-making, enabling developers to add intelligent features to their applications.
  3. Azure Bot Service: A service for building, testing, and deploying chatbots across multiple channels, with built-in natural language understanding (NLU) capabilities.
  4. Azure Cognitive Search: A fully managed search-as-a-service solution that uses AI to provide relevant search results and enrich content with semantic understanding.
  5. Azure Speech Service: A cloud-based service for speech recognition and speech synthesis, with support for custom speech models and real-time transcription.
  6. Azure Computer Vision: A service for analyzing and extracting insights from images and videos, including object detection, image recognition, and image classification.
  7. Azure Language Understanding (LUIS): A service for building natural language understanding models to interpret user intents and extract relevant information from text inputs.
  8. Azure Personalizer: A reinforcement learning-based service for personalizing content and recommendations in applications, based on user interactions and feedback.
  9. Azure Video Analyzer: A service for analyzing and extracting insights from videos, including object tracking, motion detection, and scene understanding.
  10. Azure IoT Edge: A service for deploying AI models and analytics to edge devices, enabling real-time processing and decision-making at the edge of the network.

IBM Watson

  1. Watson Assistant: A conversational AI platform that enables developers to build and deploy chatbots and virtual assistants across multiple channels.
  2. Watson Discovery: A service for analyzing unstructured data and extracting insights using natural language processing (NLP) and machine learning.
  3. Watson Language Translator: A service for translating text between languages using advanced machine learning models.
  4. Watson Natural Language Understanding: A service for analyzing text and extracting metadata such as entities, keywords, and sentiment.
  5. Watson Speech to Text: A service for converting spoken language into text in real time, with support for multiple languages and audio formats.
  6. Watson Text to Speech: A service for synthesizing natural-sounding speech from text in multiple languages and voices.
  7. Watson Visual Recognition: A service for analyzing and classifying images using machine learning models, with support for custom image classifiers.
  8. Watson Studio: A cloud-based platform for building and deploying machine learning models, with support for data preparation, model training, and model deployment.
  9. Watson Knowledge Catalog: A service for managing and curating data assets, including datasets, models, and notebooks, to facilitate collaboration and reuse.
  10. Watson OpenScale: A service for monitoring and managing AI models in production, including bias detection, model fairness, and performance monitoring.

Sponsor My AI Project - Predicting and Addressing Criminal Activities

I need some sponsors!

Here is my WIP project:

  • Gather historical data on all criminal events in my city
  • Take live feeds from different sources (e.g. police reports, call centers, government agencies, witnesses)
  • Cleanse the data and remove outliers
  • Split the dataset into training and test sets
  • Choose the best ML algorithm and use it to analyze the data
  • Predict the highest-frequency days/hours for each zone
  • Notify the police stations and recommend that they prepare additional resources for peak crime days/hours
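
Before any ML model is chosen, a simple frequency count can serve as a baseline for the "peak days/hours per zone" goal. The sketch below is only that baseline, under assumptions: the event format (zone name plus ISO timestamp), the function name, and the sample records are all hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def peak_slots(events, top_n=1):
    """Map each zone to its most frequent (weekday, hour) slots in the history."""
    counts = defaultdict(Counter)
    for zone, timestamp in events:
        when = datetime.fromisoformat(timestamp)
        counts[zone][(DAYS[when.weekday()], when.hour)] += 1
    return {
        zone: [slot for slot, _ in slot_counts.most_common(top_n)]
        for zone, slot_counts in counts.items()
    }


# Hypothetical historical events: (zone, ISO 8601 timestamp).
history = [
    ("Zone 1", "2024-03-01T22:15:00"),
    ("Zone 1", "2024-03-08T22:40:00"),
    ("Zone 1", "2024-03-02T14:05:00"),
    ("Zone 2", "2024-03-02T01:30:00"),
    ("Zone 2", "2024-03-09T01:10:00"),
    ("Zone 2", "2024-03-04T09:00:00"),
]
peaks = peak_slots(history)
print(peaks)
```

A trained model would need to beat this count-based baseline on held-out data to justify its added complexity.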

Here's a high-level overview of how I will proceed:

  1. Data Gathering: Collect historical data on criminal events in our city from relevant sources such as police records, crime databases, news reports, and government statistics. Additionally, set up mechanisms to gather live data from sources like police reports, emergency call centers, government agencies, and eyewitness accounts.

  2. Data Cleansing and Preprocessing: Cleanse the collected data to remove inconsistencies, errors, and outliers. This may involve tasks such as handling missing values, standardizing data formats, and identifying and removing erroneous entries.

  3. Data Splitting: Split the cleaned dataset into training and testing subsets. The training dataset will be used to train our machine learning model, while the testing dataset will be used to evaluate its performance.

  4. Model Selection and Training: Choose the most appropriate machine learning algorithm for our problem. This could involve experimenting with different algorithms such as decision trees, random forests, support vector machines, or neural networks. Train the selected model using the training dataset.

  5. Model Evaluation: Evaluate the performance of our trained model using the testing dataset. Common evaluation metrics for classification tasks include accuracy, precision, recall, and F1 score. Choose the evaluation metrics that are most relevant to our project's objectives.

  6. Prediction: Once our model is trained and evaluated, use it to make predictions on new data. In this case, use the trained model to predict the highest frequency days and hours for criminal events in particular zones of our city.

  7. Notification and Recommendation: Develop a mechanism to notify relevant authorities, such as police stations, about the predicted high-frequency days and hours for criminal activity in specific zones. Additionally, provide recommendations on how they can prepare and allocate resources accordingly.

  8. Deployment and Monitoring: Deploy our model and notification system in a production environment. Continuously monitor the system's performance and update the model as needed with new data and insights.

  9. Feedback Loop: Establish a feedback loop to collect data on the effectiveness of our predictions and recommendations. Use this feedback to refine and improve our model over time.

  10. Documentation and Maintenance: Document the entire project, including data sources, preprocessing steps, model selection, training, and deployment processes. Regularly maintain and update our system to ensure its effectiveness and relevance over time.
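
For step 3, one option worth considering is a chronological split rather than a random one, since crime events are time-ordered and a random split would let the model see events that happen after some of its test examples. The sketch below assumes the same hypothetical (zone, ISO timestamp) event format; the function name and the 25% test fraction are illustrative choices, not decisions already made for this project.

```python
from datetime import datetime


def chronological_split(events, test_fraction=0.25):
    """Sort events by time and hold out the most recent fraction for testing."""
    ordered = sorted(events, key=lambda e: datetime.fromisoformat(e[1]))
    cut = int(len(ordered) * (1 - test_fraction))
    return ordered[:cut], ordered[cut:]


# Hypothetical events, deliberately out of order.
events = [
    ("Zone 1", "2024-03-08T22:40:00"),
    ("Zone 2", "2024-01-15T01:30:00"),
    ("Zone 1", "2024-02-02T14:05:00"),
    ("Zone 2", "2024-04-20T09:00:00"),
    ("Zone 1", "2024-01-03T23:10:00"),
    ("Zone 2", "2024-02-27T02:45:00"),
    ("Zone 1", "2024-04-05T21:55:00"),
    ("Zone 2", "2024-03-19T00:20:00"),
]
train, test = chronological_split(events)
print(len(train), len(test))
```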
