@southern-cross-ai

Southern Cross AI

Australia's First Large Language Model Research Initiative

✨ Welcome to Southern Cross AI ✨
We aim to develop Australia's first open-source large language model
through collaboration across the academic, research, government, and business sectors.

Wanna make friends and munch some snacks? Let's Meetup!

Join our exciting 12-week (Aug 5 - Oct 7) Meetup events held every Monday:

Update (Oct 8): Unfortunately, our 12-week journey has come to an end. A heartfelt ❤️ thank you to all the community members who joined us over the past few months. It’s been an amazing experience sharing this time with you! Stay tuned to our Meetup page, and we’ll see you all next semester : )

New kid in town? No worries, we got you!

Onboard LLMs

LLM Battleground

LLM Playground

Misc

Call for Contributors - We need your magic to make things happen

  • Data Source Contributor 🕵️‍♀️
    • Identify and provide access to Australia-related data sources.
    • Collaborate with other contributors to ensure data quality and relevance.
  • Data Collecting, Crawling and Scraping 👩‍🌾
    • Develop scripts and tools to collect data from various sources.
    • (Optional) Have experience with web scraping tools (e.g., BeautifulSoup, Scrapy).
  • Data Cleaning 👩‍⚕️
    • Clean and preprocess datasets to ensure they are ready for analysis and modeling.
    • (Optional) Have experience with data manipulation libraries (e.g., Pandas, NumPy).
  • Model Building, Training and Tuning 👩‍💻
    • Develop and train LLMs on our datasets.
    • Have experience with machine learning frameworks (e.g., TensorFlow, PyTorch).
  • GitHub Organising 👩‍🔧
    • Manage the GitHub repository by organising files, documentation, and issues.
    • (Optional) Have proficiency in using Git and GitHub.
  • Hugging Face Organising 👩‍🏭
    • Manage and organise model versions and datasets.
    • Ensure proper documentation and metadata for each model and dataset.
  • Social Media Organising 👩‍💼
    • Promote the project and its updates on social media platforms (e.g., Discord, Meetup).
    • Engage with the community to increase project visibility and collaboration.

Can't wait to join us? Send a message to our lovely team members:

Pinned

  1. BabyJoey Public

    Small 115 million parameter model - .5GB

    Python 4 9

  2. Gutenberg-Data Public

    HTML 3 2

  3. Dataset-Repo-Template Public template

    A Template for Creating Your Dataset Repos

    1
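The ".5GB" figure quoted for BabyJoey is roughly what you would expect from 115 million parameters if each weight is stored as a 32-bit float. That storage format is an assumption here, not something the description states, and the helper name `checkpoint_size_gb` is hypothetical. A minimal back-of-the-envelope sketch:

```python
# Rough weight-storage estimate for a model checkpoint.
# Assumption (not stated in the source): weights are stored as 32-bit floats,
# i.e. 4 bytes per parameter. Optimizer state and metadata are ignored.

def checkpoint_size_gb(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

babyjoey_params = 115_000_000  # "115 million parameter" from the repo description

size_fp32 = checkpoint_size_gb(babyjoey_params)     # 4 bytes per weight
size_fp16 = checkpoint_size_gb(babyjoey_params, 2)  # 2 bytes per weight, for comparison

print(f"fp32: {size_fp32:.2f} GB, fp16: {size_fp16:.2f} GB")
```

Under the 32-bit assumption the estimate is about 0.46 GB, which rounds to the quoted .5GB. A half-precision checkpoint would be about half that.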

Repositories

Showing 10 of 35 repositories
  • website Public

    Southern Cross AI's open-source website

    TypeScript 3 7 0 0 Updated Oct 14, 2024
  • BabyJoey Public

    Small 115 million parameter model - .5GB

    Python 4 Apache-2.0 9 7 0 Updated Oct 9, 2024
  • .github Public

    These are the default community health files for Southern Cross AI's GitHub profile.

    0 Apache-2.0 0 0 0 Updated Oct 8, 2024
  • Braided-Channels Public

    Interview Dataset from the Braided Channels Research Collection

    Jupyter Notebook 1 MIT 0 1 0 Updated Sep 7, 2024
  • OpenAustralia Public

    Dataset of House and Senate Debates from Australian Parliament

    HTML 1 MIT 0 1 0 Updated Sep 5, 2024
  • Inside-Airbnb-Australia Public

    Airbnb's Residential Dataset (Australia)

    Jupyter Notebook 1 MIT 0 1 0 Updated Sep 5, 2024
  • ICE-AUS Public

    Corpus Dataset from the Australian component of the International Corpus of English (ICE-AUS)

    Python 1 MIT 0 0 0 Updated Sep 5, 2024
  • CoANZSE Public

    Dataset from the Corpus of Australian and New Zealand Spoken English (CoANZSE)

    Python 1 0 0 0 Updated Sep 5, 2024
  • Dewr-data
    0 0 1 0 Updated Aug 27, 2024
  • AU-website Public

    This is the fo

    Python 0 MIT 1 1 1 Updated Aug 25, 2024
