Commit 7929bd4

build a simple google translate example around simple_report.py (NO_JIRA)
1 parent 51ff6c8 commit 7929bd4

File tree

1 file changed

+378
-0
lines changed

@@ -0,0 +1,378 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"```\n",
8+
"This script can be used for any purpose without limitation subject to the\n",
9+
"conditions at http://www.ccdc.cam.ac.uk/Community/Pages/Licences/v2.aspx\n",
10+
"\n",
11+
"This permission notice and the following statement of attribution must be\n",
12+
"included in all copies or substantial portions of this script.\n",
13+
"\n",
14+
"2024-09-11: Made available by the Cambridge Crystallographic Data Centre.\n",
15+
"\n",
16+
"```"
17+
]
18+
},
19+
{
20+
"cell_type": "markdown",
21+
"metadata": {},
22+
"source": [
23+
"# International Report\n",
24+
"\n",
25+
"This notebook walks through how to create a report on a CSD entry that uses Google Translate to translate the headings into a language of the user's choice.\n",
26+
"You could use the ideas here to convert the `simple_report.py` script into an international report.\n",
27+
"\n",
28+
"#### Prerequisites\n",
29+
"\n",
30+
"First install the CSD Python API and googletrans into your conda or pip environment. This only needs to be done once, before you first run the script."
31+
]
32+
},
33+
{
34+
"cell_type": "markdown",
35+
"metadata": {},
36+
"source": [
37+
"Now we are ready to start some Python coding. We need to import some modules. We are writing a script that will ultimately be used in Mercury, so we will want some of\n",
38+
"the special utilities available for writing reports."
39+
]
40+
},
41+
{
42+
"cell_type": "code",
43+
"execution_count": 1,
44+
"metadata": {},
45+
"outputs": [],
46+
"source": [
47+
"from ccdc.utilities import html_table\n"
48+
]
49+
},
50+
{
51+
"cell_type": "markdown",
52+
"metadata": {},
53+
"source": [
54+
"We can create a translator for the script to use and define our language of choice. Let's write a small function to do the translation. It returns the text unchanged when no language is given or when the translation fails."
55+
]
56+
},
57+
{
58+
"cell_type": "code",
59+
"execution_count": 2,
60+
"metadata": {},
61+
"outputs": [],
62+
"source": [
63+
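"from googletrans import Translator\n",
"\n",
"translator = Translator()\n",
"\n",
"\n",
"def tr(text, lang):\n",
"    \"\"\"Translate English text into lang; return it unchanged if lang is None or translation fails.\"\"\"\n",
"    if lang is None:\n",
"        return text\n",
"    try:\n",
"        return translator.translate(str(text), src=\"en\", dest=lang).text\n",
"    except Exception:\n",
"        # Fall back to the original text rather than breaking the report\n",
"        return text"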
73+
]
74+
},
75+
{
76+
"cell_type": "markdown",
77+
"metadata": {},
78+
"source": [
79+
"Let's try it out:"
80+
]
81+
},
82+
{
83+
"cell_type": "code",
84+
"execution_count": 3,
85+
"metadata": {},
86+
"outputs": [
87+
{
88+
"data": {
89+
"text/plain": [
90+
"'ماري لديها خروف صغير'"
91+
]
92+
},
93+
"execution_count": 3,
94+
"metadata": {},
95+
"output_type": "execute_result"
96+
}
97+
],
98+
"source": [
99+
"\n",
100+
"tr(\"Mary had a little lamb\", \"ar\")  # Let's try Arabic!"
101+
]
102+
},
103+
{
104+
"cell_type": "markdown",
105+
"metadata": {},
106+
"source": [
107+
"So now we can create an international report on a CSD entry. The following function will work in a Mercury API script. Note how it creates an `ApplicationInterface` object when none is passed in."
108+
]
109+
},
110+
{
111+
"cell_type": "code",
112+
"execution_count": 4,
113+
"metadata": {},
114+
"outputs": [],
115+
"source": [
116+
"def get_coordinates(molecule, round_digits=None):\n",
"    \"\"\"Yield the label and fractional coordinates of all atoms in the molecule.\n",
"\n",
"    :param molecule: (:obj:`ccdc.molecule.Molecule`) The molecule for which to return coordinates.\n",
"    :param round_digits: (:obj:`int`) How many decimal digits coordinates should be rounded to.\n",
"    :returns: (:obj:`list`) List of the label and fractional x/y/z coordinates for each atom\n",
"      in the molecule in format ['Atom label', 'X coordinate', 'Y coordinate', 'Z coordinate'].\n",
"    \"\"\"\n",
"    for atom in molecule.atoms:\n",
"        try:\n",
"            x, y, z = atom.fractional_coordinates\n",
"            yield [atom.label,\n",
"                   x if round_digits is None else round(x, round_digits),\n",
"                   y if round_digits is None else round(y, round_digits),\n",
"                   z if round_digits is None else round(z, round_digits)]\n",
"        except TypeError:\n",
"            # Skip atoms whose coordinates cannot be unpacked (e.g. none available)\n",
"            continue\n",
133+
"\n",
134+
"\n",
135+
"def main(interface=None, lang=None):\n",
"    \"\"\"Generate a simple report based on the entry currently selected in the Mercury UI.\n",
"\n",
"    :param interface: (:obj:`ccdc.utilities.ApplicationInterface`) An ApplicationInterface instance.\n",
"    :param lang: (:obj:`str`) Language code to translate labels into, or None to keep English.\n",
"    \"\"\"\n",
"    if interface is None:\n",
"        from ccdc.utilities import ApplicationInterface\n",
"        interface = ApplicationInterface()\n",
"\n",
"    entry = interface.current_entry\n",
"\n",
"    # Open an HTML report. This will create the file, copy the CSD Python API\n",
"    # default template for reports and fill in the headline/page title.\n",
"    with interface.html_report(title=tr('Simple report on ', lang) + ' ' + entry.identifier) as report:\n",
"\n",
"        # Write the section header for Entry Details\n",
"        report.write_section_header(tr('Entry Details', lang))\n",
"        # Assemble a list of information and labels to go into the \"Entry Details\" table\n",
"        entry_details = [\n",
"            ['<b>'+tr('Chemical name', lang)+'</b>', entry.chemical_name],\n",
"            ['<b>'+tr('Synonyms', lang)+'</b>', entry.synonyms],\n",
"            ['<b>'+tr('Formula', lang)+'</b>', entry.formula],\n",
"            ['<b>'+tr('R-factor', lang)+'</b>', entry.r_factor],\n",
"            ['<b>'+tr('Disorder', lang)+'</b>', tr(entry.disorder_details, lang)],\n",
"            ['<b>'+tr('Polymorphism', lang)+'</b>', tr(entry.polymorph, lang)],\n",
"            ['<b>'+tr('3D structure', lang)+'</b>', tr(entry.has_3d_structure, lang)],\n",
"            ['<b>'+tr('Organic', lang)+'</b>', tr(entry.is_organic, lang)],\n",
"            ['<b>'+tr('Polymeric', lang)+'</b>', tr(entry.is_polymeric, lang)],\n",
"            ['<b>'+tr('Bioactivity', lang)+'</b>', tr(entry.bioactivity, lang)],\n",
"            ['<b>'+tr('Source', lang)+'</b>', tr(entry.source, lang)],\n",
"            ['<b>'+tr('Habit', lang)+'</b>', tr(entry.habit, lang)],\n",
"        ]\n",
"        # Generate an HTML table from the entry details and write it to the report\n",
"        report.write(html_table(data=entry_details, table_id='entry_details'))\n",
"\n",
"        # Write the section header for Fractional Coordinates\n",
"        report.write_section_header(tr('Fractional Coordinates', lang))\n",
"        # Get the coordinates of all the atoms from the entry and write them to an HTML table\n",
"        report.write(html_table(data=list(get_coordinates(entry.molecule, round_digits=3)),\n",
"                                table_id='fractional_coordinates',\n",
"                                header=[tr('Atom', lang), 'x', 'y', 'z']))\n",
"\n",
"        # Write the section header for Publication Details\n",
"        report.write_section_header(tr('Publication Details', lang))\n",
"        # Assemble a list of information and labels for the Publication Details table\n",
"        publication_details = [\n",
"            ['<b>'+tr('Reference', lang)+'</b>', '%s Volume %s, %s' % (getattr(entry.publication, 'journal_name', ''),\n",
"                entry.publication.volume,\n",
"                entry.publication.year)],\n",
"            ['<b>'+tr('Authors', lang)+'</b>', entry.publication.authors],\n",
"            ['<b>'+tr('Document Object Identifier', lang)+'</b>', entry.publication.doi]\n",
"        ]\n",
"        # Write the publication details to an HTML table\n",
"        report.write(html_table(publication_details, table_id='publication_details'))\n",
"\n",
"        # Write the section header for Basic Crystallographic Information\n",
"        report.write_section_header(tr('Basic Crystallographic Information', lang))\n",
"        # Assemble a list of basic crystal information and labels for the table\n",
"        crystallographic_data = [\n",
"            ['<b>'+tr('Crystal System', lang)+'</b>', entry.crystal.crystal_system],\n",
"            ['<b>'+tr('Space Group', lang)+'</b>', entry.crystal.spacegroup_symbol],\n",
"            ['<b>'+tr('Cell Volume', lang)+'</b>', '%s ų' % round(entry.crystal.cell_volume, 3)],\n",
"            ['<b>Z, Z\\'</b>', entry.crystal.z_prime],\n",
"        ]\n",
"        # Write the crystallographic details to an HTML table\n",
"        report.write(html_table(crystallographic_data, table_id='crystallographic_information'))\n",
"\n",
"    # Once the HTMLReport is closed (i.e. when the with: block above ends),\n",
"    # it will automatically write the appropriate HTML footer."
204+
]
205+
},
206+
{
207+
"cell_type": "markdown",
208+
"metadata": {},
209+
"source": [
210+
"Let's use our function. As we are in a notebook rather than in Mercury, we have to set up a dummy interface object ourselves:"
211+
]
212+
},
213+
{
214+
"cell_type": "code",
215+
"execution_count": 5,
216+
"metadata": {},
217+
"outputs": [],
218+
"source": [
219+
"import time\n",
220+
"from ccdc.utilities import ApplicationInterface\n",
221+
"interface = ApplicationInterface(parse_commandline=False)\n",
222+
"interface.identifier = \"AABHTZ\"\n",
223+
"interface.output_html_file = f'{interface.identifier}_{time.strftime(\"%H%M%S\", time.gmtime())}_report.html'\n",
224+
"\n",
225+
"main(interface,\"ar\") # Arabic "
226+
]
227+
},
228+
{
229+
"cell_type": "markdown",
230+
"metadata": {},
231+
"source": [
232+
"Finally, use Jupyter to display the generated HTML file:"
233+
]
234+
},
235+
{
236+
"cell_type": "code",
237+
"execution_count": 6,
238+
"metadata": {},
239+
"outputs": [
240+
{
241+
"data": {
242+
"text/html": [
243+
"<!DOCTYPE html>\n",
244+
"<html>\n",
245+
"\t<head>\n",
246+
"\t\t<style type=\"text/css\">\n",
247+
"\t\t\tbody { font-family: Calibri, Verdana, sans-serif; padding: 1.5em; font-size: 1.2em}\n",
248+
"\t\t\t#ccdc_logo { float: right; margin: -2em 1.5em 1.5em 1.5em; }\n",
249+
"\t\t\tp { text-align: justify; }\n",
250+
"\t\t\th1 { text-align: left; font-size: 1.8em; font-weight: bold; }\n",
251+
"\t\t\th2 { text-align: left; font-size: 1.5em; font-weight: bold; }\n",
252+
"\t\t\ttable { width: 100%; margin:auto; border: 1px solid; border-collapse:collapse; }\n",
253+
"\t\t\tth { padding: .2em; font-weight: bold; }\n",
254+
"\t\t\ttd { padding: .2em; }\n",
255+
"\t\t\ttable, th, td { border: 1px solid; }\n",
256+
"\t\t\ttr:nth-child(even) { background: #ccc; }\n",
257+
" </style>\n",
258+
"\t\t<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\"/>\n",
259+
"\t\t<title>تقرير بسيط عن AABHTZ</title>\n",
260+
" </head>\n",
261+
"\t<body>\n",
262+
"<img src=\"ccdc_logo_180x180_with_text.png\" id=\"ccdc_logo\" alt=\"CCDC\" />\n",
263+
"<h1 id=\"report_header\">تقرير بسيط عن AABHTZ</h1>\n",
264+
"<h2 id=\"\">تفاصيل الدخول</h2>\n",
265+
"<table id=\"entry_details\">\n",
266+
" <tr><td><b>الاسم الكيميائي</b></td><td>4-Acetoamido-3-(1-acetyl-2-(2,6-dichlorobenzylidene)hydrazine)-1,2,4-triazole</td></tr>\n",
267+
" <tr><td><b>المرادفات</b></td><td>()</td></tr>\n",
268+
" <tr><td><b>صيغة</b></td><td>C13 H12 Cl2 N6 O2</td></tr>\n",
269+
" <tr><td><b>عامل r</b></td><td>4.1</td></tr>\n",
270+
" <tr><td><b>اضطراب</b></td><td>لا أحد</td></tr>\n",
271+
" <tr><td><b>تعدد الأشكال</b></td><td>لا أحد</td></tr>\n",
272+
" <tr><td><b>هيكل ثلاثي الأبعاد</b></td><td>حقيقي</td></tr>\n",
273+
" <tr><td><b>عضوي</b></td><td>حقيقي</td></tr>\n",
274+
" <tr><td><b>البوليمرية</b></td><td>خطأ شنيع</td></tr>\n",
275+
" <tr><td><b>النشاط الحيوي</b></td><td>لا أحد</td></tr>\n",
276+
" <tr><td><b>مصدر</b></td><td>لا أحد</td></tr>\n",
277+
" <tr><td><b>عادة</b></td><td>لا أحد</td></tr>\n",
278+
"</table>\n",
279+
"<h2 id=\"\">إحداثيات كسرية</h2>\n",
280+
"<table id=\"fractional_coordinates\">\n",
281+
" <tr><th>ذرة</th><th>x</th><th>y</th><th>z</th></tr>\n",
282+
" <tr><td>Cl1</td><td>-0.336</td><td>0.1</td><td>0.106</td></tr>\n",
283+
" <tr><td>Cl2</td><td>-0.641</td><td>-0.308</td><td>0.327</td></tr>\n",
284+
" <tr><td>C1</td><td>-0.478</td><td>0.039</td><td>0.231</td></tr>\n",
285+
" <tr><td>C2</td><td>-0.573</td><td>0.134</td><td>0.342</td></tr>\n",
286+
" <tr><td>C3</td><td>-0.685</td><td>0.092</td><td>0.448</td></tr>\n",
287+
" <tr><td>C4</td><td>-0.702</td><td>-0.045</td><td>0.445</td></tr>\n",
288+
" <tr><td>C5</td><td>-0.607</td><td>-0.139</td><td>0.33</td></tr>\n",
289+
" <tr><td>C6</td><td>-0.49</td><td>-0.101</td><td>0.218</td></tr>\n",
290+
" <tr><td>C7</td><td>-0.384</td><td>-0.194</td><td>0.089</td></tr>\n",
291+
" <tr><td>N1</td><td>-0.367</td><td>-0.298</td><td>0.135</td></tr>\n",
292+
" <tr><td>N2</td><td>-0.263</td><td>-0.38</td><td>0.01</td></tr>\n",
293+
" <tr><td>C8</td><td>-0.186</td><td>-0.365</td><td>-0.175</td></tr>\n",
294+
" <tr><td>N3</td><td>-0.22</td><td>-0.386</td><td>-0.335</td></tr>\n",
295+
" <tr><td>N4</td><td>-0.115</td><td>-0.36</td><td>-0.484</td></tr>\n",
296+
" <tr><td>C9</td><td>-0.025</td><td>-0.326</td><td>-0.406</td></tr>\n",
297+
" <tr><td>N5</td><td>-0.064</td><td>-0.327</td><td>-0.21</td></tr>\n",
298+
" <tr><td>N6</td><td>0.002</td><td>-0.284</td><td>-0.074</td></tr>\n",
299+
" <tr><td>C10</td><td>-0.031</td><td>-0.161</td><td>0.074</td></tr>\n",
300+
" <tr><td>C11</td><td>0.041</td><td>-0.124</td><td>0.222</td></tr>\n",
301+
" <tr><td>O1</td><td>-0.11</td><td>-0.088</td><td>0.079</td></tr>\n",
302+
" <tr><td>C12</td><td>-0.239</td><td>-0.486</td><td>0.072</td></tr>\n",
303+
" <tr><td>C13</td><td>-0.331</td><td>-0.509</td><td>0.255</td></tr>\n",
304+
" <tr><td>O2</td><td>-0.145</td><td>-0.554</td><td>-0.03</td></tr>\n",
305+
" <tr><td>H1</td><td>-0.558</td><td>0.232</td><td>0.347</td></tr>\n",
306+
" <tr><td>H2</td><td>-0.752</td><td>0.157</td><td>0.531</td></tr>\n",
307+
" <tr><td>H3</td><td>-0.784</td><td>-0.078</td><td>0.53</td></tr>\n",
308+
" <tr><td>H4</td><td>-0.326</td><td>-0.175</td><td>-0.032</td></tr>\n",
309+
" <tr><td>H5</td><td>0.057</td><td>-0.296</td><td>-0.469</td></tr>\n",
310+
" <tr><td>H6</td><td>0.046</td><td>-0.34</td><td>-0.057</td></tr>\n",
311+
" <tr><td>H7</td><td>0.081</td><td>-0.036</td><td>0.217</td></tr>\n",
312+
" <tr><td>H8</td><td>0.105</td><td>-0.198</td><td>0.189</td></tr>\n",
313+
" <tr><td>H9</td><td>-0.006</td><td>-0.107</td><td>0.333</td></tr>\n",
314+
" <tr><td>H10</td><td>-0.313</td><td>-0.598</td><td>0.275</td></tr>\n",
315+
" <tr><td>H11</td><td>-0.329</td><td>-0.451</td><td>0.374</td></tr>\n",
316+
" <tr><td>H12</td><td>-0.413</td><td>-0.525</td><td>0.243</td></tr>\n",
317+
"</table>\n",
318+
"<h2 id=\"\">تفاصيل النشر</h2>\n",
319+
"<table id=\"publication_details\">\n",
320+
" <tr><td><b>مرجع</b></td><td> Volume 5, 1976</td></tr>\n",
321+
" <tr><td><b>المؤلفون</b></td><td>P.-E.Werner</td></tr>\n",
322+
" <tr><td><b>معرف كائن المستند</b></td><td>None</td></tr>\n",
323+
"</table>\n",
324+
"<h2 id=\"\">المعلومات البلورية الأساسية</h2>\n",
325+
"<table id=\"crystallographic_information\">\n",
326+
" <tr><td><b>نظام البلورة</b></td><td>triclinic</td></tr>\n",
327+
" <tr><td><b>مجموعة الفضاء</b></td><td>P-1</td></tr>\n",
328+
" <tr><td><b>حجم الخلية</b></td><td>769.978 ų</td></tr>\n",
329+
" <tr><td><b>Z, Z'</b></td><td>1.0</td></tr>\n",
330+
"</table>\n",
331+
"\n",
332+
"</body>\n",
333+
"</html>\n"
334+
],
335+
"text/plain": [
336+
"<IPython.core.display.HTML object>"
337+
]
338+
},
339+
"execution_count": 6,
340+
"metadata": {},
341+
"output_type": "execute_result"
342+
}
343+
],
344+
"source": [
345+
"from IPython.display import HTML\n",
346+
"HTML(interface.output_html_file)"
347+
]
348+
},
349+
{
350+
"cell_type": "code",
351+
"execution_count": null,
352+
"metadata": {},
353+
"outputs": [],
354+
"source": []
355+
}
356+
],
357+
"metadata": {
358+
"kernelspec": {
359+
"display_name": "Python 3 (ipykernel)",
360+
"language": "python",
361+
"name": "python3"
362+
},
363+
"language_info": {
364+
"codemirror_mode": {
365+
"name": "ipython",
366+
"version": 3
367+
},
368+
"file_extension": ".py",
369+
"mimetype": "text/x-python",
370+
"name": "python",
371+
"nbconvert_exporter": "python",
372+
"pygments_lexer": "ipython3",
373+
"version": "3.9.20"
374+
}
375+
},
376+
"nbformat": 4,
377+
"nbformat_minor": 4
378+
}
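
The notebook's `tr` helper contacts the translation service every time it is called, so a report with many labels makes many requests. A minimal sketch of one way to reduce this is to wrap any `translate(text, lang)` callable in a cache. The names `make_cached_tr` and `fake_tr` are hypothetical and not part of the commit; `fake_tr` stands in for `tr` so the example runs without network access.

```python
from functools import lru_cache


def make_cached_tr(translate, maxsize=256):
    """Wrap a translate(text, lang) callable so each (text, lang) pair is translated once."""

    @lru_cache(maxsize=maxsize)
    def _cached(text, lang):
        return translate(text, lang)

    def cached_tr(text, lang):
        # Mirror the notebook's behaviour: no language means no translation
        if lang is None:
            return text
        return _cached(str(text), lang)

    return cached_tr


# Hypothetical stand-in for the notebook's tr(); it records calls instead of
# contacting a translation service.
calls = []


def fake_tr(text, lang):
    calls.append((text, lang))
    return f"[{lang}] {text}"


cached = make_cached_tr(fake_tr)
first = cached("Entry Details", "ar")
second = cached("Entry Details", "ar")   # served from the cache
untouched = cached("Entry Details", None)
```

One caveat with this design: because `tr` returns the untranslated text when a request fails, a cache wrapped around it would also remember that fallback, so a transient failure would persist until the cache is cleared.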

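
In the stored Arabic output, values such as `None`, `True` and `False` are passed through `str()` and translated as ordinary English words, which gives odd results in the table. A possible refinement, sketched here under the assumption that the entry attributes are `None`, booleans, strings, numbers or tuples, is to turn each value into readable English before translating it. The helper `display_text` and the placeholder wording `"Not recorded"` are hypothetical choices, not part of the commit.

```python
def display_text(value, empty="Not recorded"):
    """Turn a raw entry attribute into English text that is sensible to translate."""
    if value is None:
        return empty
    # bool must be checked explicitly so True/False become Yes/No
    if isinstance(value, bool):
        return "Yes" if value else "No"
    if isinstance(value, (list, tuple)):
        # An empty sequence (e.g. no synonyms) is shown as the placeholder
        return ", ".join(str(item) for item in value) or empty
    return str(value)


examples = [display_text(v) for v in (None, True, False, (), ("a", "b"), 4.1)]
```

In the notebook this would be used as `tr(display_text(entry.is_organic), lang)`, so the translator receives "Yes" or "No" rather than "True" or "False".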