Week 7 ready

CaptLuque · Dec 17, 2024 · 0b6edaa
1 parent e1aa8ce
commit 0b6edaa
Showing 25 changed files with 7,663 additions and 1,034 deletions.
diff --git a/week6/day1.ipynb b/week6/day1.ipynb
diff --git a/week6/day2.ipynb b/week6/day2.ipynb
diff --git a/week6/day3.ipynb b/week6/day3.ipynb
diff --git a/week6/day4-results.ipynb b/week6/day4-results.ipynb
@@ -5,21 +5,21 @@
    "id": "db8736a7-ed94-441c-9556-831fa57b5a10",
    "metadata": {},
    "source": [
-    "# The Product Pricer Continued\n",
+    "# El evaluador de precios de productos (continuación)\n",
     "\n",
-    "A model that can estimate how much something costs, from its description.\n",
+    "Un modelo que puede estimar cuánto cuesta algo a partir de su descripción.\n",
     "\n",
-    "## Enter The Frontier!\n",
+    "## ¡Nos marchamos a la Frontera!\n",
     "\n",
-    "And now - we put Frontier Models to the test.\n",
+    "Y ahora, ponemos a prueba los modelos de Frontier.\n",
     "\n",
-    "### 2 important points:\n",
+    "### Dos puntos importantes:\n",
     "\n",
-    "It's important to appreciate that we aren't Training the frontier models. We're only providing them with the Test dataset to see how they perform. They don't gain the benefit of the 400,000 training examples that we provided to the Traditional ML models.\n",
+    "Es importante tener en cuenta que no estamos entrenando los modelos de Frontier. Solo les proporcionamos el conjunto de datos de prueba para ver cómo funcionan. No obtienen el beneficio de los 400 000 ejemplos de entrenamiento que proporcionamos a los modelos de ML tradicionales.\n",
     "\n",
-    "HAVING SAID THAT...\n",
+    "DICHO ESTO...\n",
     "\n",
-    "It's entirely possible that in their monstrously large training data, they've already been exposed to all the products in the training AND the test set. So there could be test \"contamination\" here which gives them an unfair advantage. We should keep that in mind."
+    "Es totalmente posible que en sus monstruosos datos de entrenamiento, ya hayan estado expuestos a todos los productos en el conjunto de entrenamiento Y de prueba. Por lo tanto, podría haber una \"contaminación\" de prueba aquí que les dé una ventaja injusta. Debemos tener eso en cuenta."
    ]
   },
   {
@@ -54,8 +54,8 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# moved our Tester into a separate package\n",
-    "# call it with Tester.test(function_name, test_dataset)\n",
+    "# movimos nuestro Tester a un paquete separado\n",
+    "# lo llamamos mediante Tester.test(function_name, test_dataset)\n",
     "\n",
     "from testing import Tester"
    ]
@@ -67,7 +67,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# environment\n",
+    "# entorno\n",
     "\n",
     "load_dotenv()\n",
     "os.environ['OPENAI_API_KEY'] = os.getenv('OPENAI_API_KEY', 'your-key-if-not-using-env')\n",
@@ -93,7 +93,7 @@
     }
    ],
    "source": [
-    "# Log in to HuggingFace\n",
+    "# Log in en HuggingFace\n",
     "\n",
     "hf_token = os.environ['HF_TOKEN']\n",
     "login(hf_token, add_to_git_credential=True)"
@@ -127,7 +127,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Let's avoid curating all our data again! Load in the pickle files:\n",
+    "# ¡Evitemos tener que volver a curar todos nuestros datos! Carguemos los archivos pickle:\n",
     "\n",
     "with open('train.pkl', 'rb') as file:\n",
     "    train = pickle.load(file)\n",
@@ -141,9 +141,9 @@
    "id": "e5856173-e68c-4975-a769-5f1736e227a5",
    "metadata": {},
    "source": [
-    "# Before we look at the Frontier\n",
+    "# Antes de analizar los modelos Frontera\n",
     "\n",
-    "## There is one more model we could consider"
+    "## Hay un modelo más que podríamos considerar"
    ]
   },
   {
@@ -153,7 +153,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Write the test set to a CSV\n",
+    "# Escribe el conjunto de pruebas en un CSV\n",
     "\n",
     "import csv\n",
     "with open('human_input.csv', 'w') as csvfile:\n",
@@ -169,7 +169,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# Read it back in\n",
+    "# Lo leemos de vuelta\n",
     "\n",
     "human_predictions = []\n",
     "with open('human_output.csv', 'r') as csvfile:\n",
@@ -472,9 +472,9 @@
    "id": "066fef03-8338-4526-9df3-89b649ad4f0a",
    "metadata": {},
    "source": [
-    "## First, the humble but mighty GPT-4o-mini\n",
+    "## Primero, el humilde pero poderoso GPT-4o-mini\n",
     "\n",
-    "It's called mini, but it packs a punch."
+    "Se llama mini, pero es muy potente."
    ]
   },
   {
@@ -484,14 +484,14 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# First let's work on a good prompt for a Frontier model\n",
-    "# Notice that I'm removing the \" to the nearest dollar\"\n",
-    "# When we train our own models, we'll need to make the problem as easy as possible, \n",
-    "# but a Frontier model needs no such simplification.\n",
+    "# Primero, trabajemos en un buen mensaje para un modelo Frontier\n",
+    "# Observe que estoy eliminando el \"al dólar más cercano\"\n",
+    "# Cuando entrenemos nuestros propios modelos, necesitaremos hacer que el problema sea lo más fácil posible,\n",
+    "# pero un modelo Frontier no necesita tal simplificación.\n",
     "\n",
     "def messages_for(item):\n",
-    "    system_message = \"You estimate prices of items. Reply only with the price, no explanation\"\n",
-    "    user_prompt = item.test_prompt().replace(\" to the nearest dollar\",\"\").replace(\"\\n\\nPrice is $\",\"\")\n",
+    "    system_message = \"Estimas los precios de los artículos. Respondes solo con el precio, sin explicaciones.\"\n",
+    "    user_prompt = item.test_prompt().replace(\" al dólar más cercano\",\"\").replace(\"\\n\\nPrice is $\",\"\")\n",
     "    return [\n",
     "        {\"role\": \"system\", \"content\": system_message},\n",
     "        {\"role\": \"user\", \"content\": user_prompt},\n",
@@ -529,7 +529,7 @@
     }
    ],
    "source": [
-    "# Try this out\n",
+    "# Vamos a probarlo\n",
     "\n",
     "messages_for(test[0])"
    ]
@@ -541,7 +541,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# A utility function to extract the price from a string\n",
+    "# Una función de utilidad para extraer el precio de un string\n",
     "\n",
     "def get_price(s):\n",
     "    s = s.replace('$','').replace(',','')\n",
@@ -567,7 +567,7 @@
     }
    ],
    "source": [
-    "get_price(\"The price is roughly $99.99 because blah blah\")"
+    "get_price(\"El precio es de aproximadamente $99,99 porque bla, bla, bla.\")"
    ]
   },
   {
@@ -577,7 +577,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# The function for gpt-4o-mini\n",
+    "# La función para gpt-4o-mini\n",
     "\n",
     "def gpt_4o_mini(item):\n",
     "    response = openai.chat.completions.create(\n",
@@ -895,7 +895,7 @@
    "metadata": {},
    "outputs": [],
    "source": [
-    "# The function for gpt-4o - the August model\n",
+    "# La función para gpt-4o - para el modelo de Agosto\n",
     "\n",
     "def gpt_4o_frontier(item):\n",
     "    response = openai.chat.completions.create(\n",