This repository has been archived by the owner on Jun 3, 2021. It is now read-only.

edit regex
martindaniel4 committed Feb 27, 2015
1 parent d0a3b94 commit 70cfbf6
Showing 1 changed file with 55 additions and 68 deletions.
123 changes: 55 additions & 68 deletions day3/03 - Focus sur les expressions régulières (Regex).ipynb
@@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:83d5340cfc6187353dd41abead5a22c1c117ec3bf55e5a3d46574a2d709ed91d"
"signature": "sha256:b727278faf4a59e5696fe4f63522e8620a6dd808092805f8fb1ca4f3dc3ec87c"
},
"nbformat": 3,
"nbformat_minor": 0,
@@ -76,7 +76,7 @@
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 14
"prompt_number": 2
},
{
"cell_type": "markdown",
@@ -137,26 +137,23 @@
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>6 rows \u00d7 1 columns</p>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 40,
"prompt_number": 3,
"text": [
" noms&mails\n",
"0 Martin Daniel\n",
"1 martin@gmail.com\n",
"2 Vincent Simon\n",
"3 vincent.simon@laposte.net\n",
"4 Bob\n",
"5 bobby@zimmerman.com\n",
"\n",
"[6 rows x 1 columns]"
"5 bobby@zimmerman.com"
]
}
],
"prompt_number": 40
"prompt_number": 3
},
{
"cell_type": "code",
@@ -250,25 +247,22 @@
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>5 rows \u00d7 1 columns</p>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 45,
"prompt_number": 4,
"text": [
" donn\u00e9es\n",
"0 Martin Daniel\n",
"1 martin@gmail.com\n",
"2 1234\n",
"3 0637687898\n",
"4 Bob\n",
"\n",
"[5 rows x 1 columns]"
"4 Bob"
]
}
],
"prompt_number": 45
"prompt_number": 4
},
{
"cell_type": "heading",
@@ -284,7 +278,7 @@
"input": [
"pattern1 = r'\\w'\n",
"\n",
"resultat1 = df2['donn\u00e9es'].str.match(pattern1, re.IGNORECASE)\n",
"resultat1 = df2['donn\u00e9es'].str.findall(pattern1, re.IGNORECASE)\n",
"print resultat1"
],
"language": "python",
@@ -294,16 +288,16 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"0 True\n",
"1 True\n",
"2 True\n",
"3 True\n",
"4 True\n",
"Name: donn\u00e9es, dtype: bool\n"
"0 [M, a, r, t, i, n, D, a, n, i, e, l]\n",
"1 [m, a, r, t, i, n, g, m, a, i, l, c, o, m]\n",
"2 [1, 2, 3, 4]\n",
"3 [0, 6, 3, 7, 6, 8, 7, 8, 9, 8]\n",
"4 [B, o, b]\n",
"Name: donn\u00e9es, dtype: object\n"
]
}
],
"prompt_number": 58
"prompt_number": 8
},
{
"cell_type": "markdown",
@@ -326,7 +320,7 @@
"input": [
"pattern2 = r'\\d'\n",
"\n",
"resultat2 = df2['donn\u00e9es'].str.match(pattern2, re.IGNORECASE)\n",
"resultat2 = df2['donn\u00e9es'].str.findall(pattern2, re.IGNORECASE)\n",
"print resultat2"
],
"language": "python",
@@ -336,16 +330,16 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"0 False\n",
"1 False\n",
"2 True\n",
"3 True\n",
"4 False\n",
"Name: donn\u00e9es, dtype: bool\n"
"0 []\n",
"1 []\n",
"2 [1, 2, 3, 4]\n",
"3 [0, 6, 3, 7, 6, 8, 7, 8, 9, 8]\n",
"4 []\n",
"Name: donn\u00e9es, dtype: object\n"
]
}
],
"prompt_number": 60
"prompt_number": 9
},
{
"cell_type": "markdown",
@@ -368,7 +362,7 @@
"input": [
"pattern3 = r'\\D'\n",
"\n",
"resultat3 = df2['donn\u00e9es'].str.match(pattern3, re.IGNORECASE)\n",
"resultat3 = df2['donn\u00e9es'].str.findall(pattern3, re.IGNORECASE)\n",
"print resultat3"
],
"language": "python",
@@ -378,16 +372,16 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"0 True\n",
"1 True\n",
"2 False\n",
"3 False\n",
"4 True\n",
"Name: donn\u00e9es, dtype: bool\n"
"0 [M, a, r, t, i, n, , D, a, n, i, e, l]\n",
"1 [m, a, r, t, i, n, @, g, m, a, i, l, ., c, o, m]\n",
"2 []\n",
"3 []\n",
"4 [B, o, b]\n",
"Name: donn\u00e9es, dtype: object\n"
]
}
],
"prompt_number": 62
"prompt_number": 10
},
{
"cell_type": "markdown",
@@ -410,7 +404,7 @@
"input": [
"pattern4 = r'06\\d{8}'\n",
"\n",
"resultat4 = df2['donn\u00e9es'].str.match(pattern4, re.IGNORECASE)\n",
"resultat4 = df2['donn\u00e9es'].str.findall(pattern4, re.IGNORECASE)\n",
"print resultat4"
],
"language": "python",
@@ -420,16 +414,16 @@
"output_type": "stream",
"stream": "stdout",
"text": [
"0 False\n",
"1 False\n",
"2 False\n",
"3 True\n",
"4 False\n",
"Name: donn\u00e9es, dtype: bool\n"
"0 []\n",
"1 []\n",
"2 []\n",
"3 [0637687898]\n",
"4 []\n",
"Name: donn\u00e9es, dtype: object\n"
]
}
],
"prompt_number": 69
"prompt_number": 11
},
{
"cell_type": "markdown",
@@ -450,7 +444,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"So far we have used the .match() function, which returns a boolean ```True``` or ```False```. <br>\n",
"So far we have used the .findall() function, which returns the list of matching substrings. <br>\n",
"There are several other very useful functions:"
]
},
@@ -507,26 +501,23 @@
" </tr>\n",
" </tbody>\n",
"</table>\n",
"<p>6 rows \u00d7 1 columns</p>\n",
"</div>"
],
"metadata": {},
"output_type": "pyout",
"prompt_number": 75,
"prompt_number": 12,
"text": [
" noms&mails\n",
"0 Martin Daniel\n",
"1 martin@gmail.com\n",
"2 Vincent Simon\n",
"3 vincent.simon@laposte.net\n",
"4 Bob\n",
"5 bobby@zimmerman.com, bobby@zimm.fr\n",
"\n",
"[6 rows x 1 columns]"
"5 bobby@zimmerman.com, bobby@zimm.fr"
]
}
],
"prompt_number": 75
"prompt_number": 12
},
{
"cell_type": "code",
@@ -537,7 +528,7 @@
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 72
"prompt_number": 13
},
{
"cell_type": "heading",
@@ -571,7 +562,7 @@
]
}
],
"prompt_number": 78
"prompt_number": 14
},
{
"cell_type": "markdown",
@@ -619,27 +610,23 @@
"input": [
"# To display these results more cleanly,\n",
"# we can use the .str[0] accessor to get the first element of each result list\n",
"resultatfindall.str[0]"
"resultatfindall.str[1]"
],
"language": "python",
"metadata": {},
"outputs": [
{
"metadata": {},
"output_type": "pyout",
"prompt_number": 96,
"text": [
"0 NaN\n",
"1 martin@gmail.com\n",
"2 NaN\n",
"3 vincent.simon@laposte.net\n",
"4 NaN\n",
"5 bobby@zimmerman.com\n",
"Name: noms&mails, dtype: object"
"ename": "NameError",
"evalue": "name 'resultatfindall' is not defined",
"output_type": "pyerr",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m\n\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
"\u001b[0;32m<ipython-input-15-0f6301026d3b>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m()\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0;31m# Pour afficher ces r\u00e9sultat de fa\u00e7on plus propre,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0;31m# nous pouvons utliser la methode .str[0] pour avoir la premi\u00e8re colonne des r\u00e9sultas\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0mresultatfindall\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstr\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m",
"\u001b[0;31mNameError\u001b[0m: name 'resultatfindall' is not defined"
]
}
],
"prompt_number": 96
"prompt_number": 15
},
{
"cell_type": "code",

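A note on the change from `.str.match` to `.str.findall` in this commit: the two differ in what they return for each row. The sketch below mimics that difference with Python's standard `re` module on plain lists rather than pandas. The names `values`, `matched`, `found`, and `first` are illustrative and do not come from the notebook. It assumes pandas applies `re.match` / `re.findall` semantics element by element, which should hold in broad strokes but has not been checked against the pandas version used here. It is written for Python 3, whereas the notebook uses Python 2 `print` statements.

```python
import re

# Sample values mirroring the 'données' column shown in the diff.
values = ["Martin Daniel", "martin@gmail.com", "1234", "0637687898", "Bob"]

# Same pattern as pattern4 in the notebook: "06" followed by eight digits.
pattern = r"06\d{8}"

# Roughly what .str.match reports: does the pattern match at the start of each string?
matched = [re.match(pattern, v) is not None for v in values]

# Roughly what .str.findall reports: every non-overlapping match found in each string.
found = [re.findall(pattern, v) for v in values]

# First match per row; None stands in for the missing value when nothing matched.
first = [hits[0] if hits else None for hits in found]

print(matched)
print(found)
print(first)
```

The last list corresponds to what `.str[0]` is meant to yield on a `findall` result: the first match per row, with a missing value (NaN in pandas, `None` here) where nothing matched.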