Commit 85e1085

add how to write good articles
1 parent e1ba94c commit 85e1085

2 files changed: 182 additions and 24 deletions

contribution.md

Lines changed: 7 additions & 24 deletions
@@ -1,4 +1,4 @@
1-
# Contribution Guidelines
1+
<file name=1 path=/Users/khuyentran/Data-science/contribution.md># Contribution Guidelines
22

33
## Table of Contents
44

@@ -27,32 +27,15 @@ As a writer for CodeCut, your role is to:
2727
- Maintain a tone that is approachable, confident, and helpful
2828
- Show rather than tell - use code snippets, visuals, or graphs to demonstrate your points
2929

30-
## Writing Checklist
30+
## How to Write a Good Article
3131

32-
To check off an item, replace `[ ]` with `[x]`.
32+
Good technical articles are:
3333

34-
You can check off these items directly in your IDE (such as VS Code, PyCharm, or others).
34+
- Easy to skim
35+
- Broadly helpful
36+
- Clear and concise
3537

36-
### Writing Style Checklist
37-
38-
- [ ] Use action verbs instead of passive voice
39-
- [ ] Limit paragraphs to 2 sentences. If the paragraph explains a code snippet or workflow, consider using bullet points to make it easier to follow.
40-
- [ ] For every major code block, provide a clear explanation of what it does and why it matters.
41-
- [ ] Structure content for quick scanning with clear headings and bullet points
42-
43-
### Data Science-Focused Writing Checklist
44-
45-
- [ ] Write for data scientists comfortable with Python but unfamiliar with this specific tool or library.
46-
- [ ] Use examples that align with common data science workflows or problems
47-
- [ ] Highlight **only** the features that matter to a data science audience
48-
49-
### Structure Checklist
50-
51-
- [ ] Start with a real, practical data science problem
52-
- [ ] Explain how each tool solves the problem
53-
- [ ] Use diagrams or charts to explain complex ideas, when appropriate.
54-
- [ ] Define new concepts and terminology
55-
- [ ] Only include the essential setup steps needed to run the examples. For anything beyond that, link to the official documentation.
38+
Follow the tips highlighted in [How to Write Good Technical Articles](how_to_write_good_articles.md) to write a good article.
5639

5740
## Write Article Draft
5841

how_to_write_good_articles.md

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
1+
# How to Write Good Technical Articles
2+
3+
Articles put useful information inside other people's heads. Follow these tips to write better articles.
4+
5+
## Principles of Good Technical Articles
6+
7+
- **Easy to skim**: Few readers read linearly from top to bottom. They'll jump around, trying to assess which bit solves their problem, if any.
8+
- **Broadly helpful**: The article should be helpful to a wide range of readers, including those who are new to the topic.
9+
- **Clear and concise**: The article should focus on the most important information and say it in as few words as the reader needs.
10+
11+
## General Tips
12+
13+
### Know Your Audience
14+
15+
Understanding your audience is crucial for effective technical writing. Before writing, consider:
16+
17+
- Their technical background and experience level
18+
- What problems they're trying to solve
19+
- What information they need to succeed
20+
21+
For CodeCut articles, we write for data scientists who:
22+
23+
- Are proficient in Python
24+
- Need to learn new tools quickly
25+
- Want practical, working examples
26+
27+
Focus on delivering exactly what they need - no more, no less. Cut any content that doesn't directly help them solve their problem.
28+
29+
30+
### Use Action Verbs
31+
32+
Use action verbs in the active voice instead of the passive voice, so the subject that performs the action comes first.
33+
34+
❌ Don't:
35+
```
36+
SQL operations on DataFrames are provided by DuckDB without server setup.
37+
```
38+
39+
✅ Do:
40+
```
41+
DuckDB provides SQL operations on DataFrames without server setup.
42+
```
43+
44+
### Don't Tell, Show
45+
46+
Graphics and code snippets often explain a point more effectively than text alone. Whenever possible, use them instead of lengthy paragraphs.
47+
48+
❌ Don't: Use only text to explain.
49+
50+
```
51+
DuckDB is a fast, in-process SQL OLAP database management system. It supports standard SQL queries on Parquet and CSV files, and provides seamless integration with Python DataFrames. DuckDB is useful for analytics workloads on local data without needing a server.
52+
```
53+
54+
✅ Do: Use both text and code snippets to explain.
55+
56+
````
57+
You can query a Parquet file directly using DuckDB with a single line of SQL:
58+
59+
```python
60+
import duckdb
61+
62+
duckdb.query("SELECT COUNT(*) FROM 'data.parquet'").show()
63+
```
64+
65+
This runs a SQL query on a local Parquet file without needing to load it into memory first.
66+
````
67+
68+
### Keep Paragraphs Short
69+
70+
Keep paragraphs short and focused. Opt for short paragraphs over long paragraphs that deliver the same information.
71+
72+
For step-by-step instructions, use bullet points instead of paragraphs to improve readability and make the sequence clear.
73+
74+
❌ Don't: Use long paragraphs.
75+
```
76+
Feature selection is an essential part of the machine learning pipeline, and depending on the data type, domain knowledge, and modeling goals, different methods such as mutual information, recursive feature elimination, or embedded methods can be used to improve model performance and interpretability.
77+
```
78+
79+
✅ Do: Use short paragraphs.
80+
```
81+
Feature selection improves model performance and interpretability by removing irrelevant variables.
82+
```
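
A quick way to spot long paragraphs in a draft is to count sentences per paragraph. The sketch below is a rough helper, not an official CodeCut tool: the function name `long_paragraphs` is hypothetical, the two-sentence limit is an assumption borrowed from the earlier writing checklist, and the sentence splitter is deliberately naive (it will miscount abbreviations such as "e.g.").

```python
import re

MAX_SENTENCES = 2  # assumed limit; adjust to your own style guide


def long_paragraphs(text, max_sentences=MAX_SENTENCES):
    """Return (paragraph_index, sentence_count) for paragraphs over the limit."""
    flagged = []
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    for index, paragraph in enumerate(paragraphs):
        # Naive rule: a sentence ends at ., !, or ? followed by whitespace.
        sentences = [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]
        if len(sentences) > max_sentences:
            flagged.append((index, len(sentences)))
    return flagged


draft = (
    "Feature selection removes irrelevant variables. It improves performance.\n\n"
    "First sentence. Second sentence. Third sentence."
)
flagged = long_paragraphs(draft)
print(flagged)
```

Treat the result as a prompt to reread the flagged paragraphs, not as a hard rule.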
83+
84+
### Begin Sections with Self-Contained Preview
85+
86+
When readers skim, they focus on the first word, line, and sentence of each section. Start sections with sentences that make sense on their own, without relying on earlier content.
87+
88+
❌ Don't:
89+
```
90+
With the previous steps completed, let's now explore hyperparameter tuning.
91+
```
92+
93+
✅ Do:
94+
```
95+
Hyperparameter tuning is a crucial step to improve model performance after initial training.
96+
```
97+
98+
99+
### Avoid Left-Branching Sentences
100+
101+
Avoid left-branching sentences. They force readers to hold details in memory until the end of the sentence reveals what those details are for.
102+
103+
❌ Don't: Use left-branching sentences.
104+
105+
```
106+
You need historical sales data, holiday indicators, weather variables, and promotional events to build an accurate time series forecast.
107+
```
108+
109+
✅ Do: Use right-branching sentences.
110+
111+
```
112+
To build an accurate time series forecast, you need historical sales data, holiday indicators, weather variables, and promotional events.
113+
```
114+
115+
116+
### Be Consistent
117+
118+
Be consistent with formatting and naming: if you use Title Case, use it everywhere.
119+
120+
### Don't Tell Readers What They Think or What to Do
121+
122+
Avoid telling the reader what they think or what to do. This can annoy readers or undermine credibility.
123+
124+
❌ Don't: Presume the reader's thoughts or intentions.
125+
126+
```
127+
Now you probably want to understand how to apply quantile forecasts.
128+
```
129+
130+
✅ Do: Use neutral, direct phrasing.
131+
132+
```
133+
To apply quantile forecasts, …
134+
```
135+
136+
### Explain Things Simply
137+
138+
Explain things more simply than you think you need to. Many readers do not speak English as a first language. Others may already be struggling with the technical terminology and have little attention left for parsing complex sentences.
139+
140+
❌ Don't: Use complex language.
141+
142+
```
143+
DuckDB implements ACID-compliant transaction management mechanisms. The following elucidates the fundamental properties of ACID transactions:
144+
145+
- **Atomicity**: The transaction execution adheres to an all-or-nothing paradigm, wherein the entire sequence of operations must either culminate in successful completion or result in a complete rollback to the initial state, ensuring data integrity through transactional boundaries.
146+
147+
- **Consistency**: The database system enforces a comprehensive set of integrity constraints and business rules throughout the transaction lifecycle, maintaining a valid and coherent state that satisfies all predefined invariants and validation criteria.
148+
149+
- **Isolation**: Concurrent transactions operate within distinct execution contexts, preventing any form of interference or data corruption through sophisticated concurrency control mechanisms that maintain transaction independence.
150+
151+
- **Durability**: Once a transaction reaches the committed state, its effects become permanently persisted in the database, withstanding any subsequent system failures, crashes, or power outages through robust persistence mechanisms.
152+
```
153+
154+
✅ Do: Use simple language.
155+
156+
```
157+
DuckDB supports ACID transactions on your data. Here are the properties of ACID transactions:
158+
159+
- **Atomicity**: The transaction either completes entirely or has no effect at all. If any operation fails, all changes are rolled back.
160+
- **Consistency**: The database maintains valid data by enforcing all rules and constraints throughout the transaction.
161+
- **Isolation**: Transactions run independently without interfering with each other.
162+
- **Durability**: Committed changes are permanent and survive system failures.
163+
```
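
A short runnable example can reinforce a plain-language definition such as atomicity. The sketch below is illustrative only: it uses Python's built-in `sqlite3` module as a stand-in for DuckDB so that it runs without extra installs, and the `accounts` table, the `transfer` function, and all names and amounts are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move `amount` between accounts; either both updates apply or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance.
        return False


# Alice only has 100, so the debit fails and the credit to Bob is rolled back.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

The credit to Bob runs first, so the unchanged balances after the failed transfer show that the whole transaction was rolled back.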
164+
165+
### Avoid Abbreviations
166+
167+
Write things out. The cost to experts is low and the benefit to beginners is high.
168+
169+
❌ Don't: Use abbreviations.
170+
171+
"RAG"
172+
173+
✅ Do: Write things out.
174+
175+
"Retrieval-augmented generation"
