
Commit acf22ed

Merge branch 'master' of https://github.com/pytorch/pytorch.github.io into pytorch-master

2 parents: db3c643 + 95cc479

681 files changed: +385941 / -21844 lines


.gitignore

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
+_site/
+Gemfile.lock

_config.yml

Lines changed: 12 additions & 2 deletions
@@ -9,11 +9,10 @@ sass:
   sass_dir: _sass # default
   style: compressed
 safe: true
-highlighter: rouge
-markdown: kramdown
 future: true
 include:
   - _static
+  - _images
   - _modules
   - _sources
   - _tensor_str.html
@@ -30,3 +29,14 @@ exclude:
 
 plugins:
   - jekyll-feed
+
+highlighter: rouge
+markdown: kramdown
+kramdown:
+  # Use GitHub flavored markdown, including triple backtick fenced code blocks
+  input: GFM
+  # Jekyll 3 and GitHub Pages now only support rouge for syntax highlighting
+  syntax_highlighter: rouge
+  syntax_highlighter_opts:
+    # Use existing pygments syntax highlighting css
+    css_class: 'highlight'

_data/wizard.yml

Lines changed: 187 additions & 113 deletions
Large diffs are not rendered by default.

_includes/primary-nav.html

Lines changed: 1 addition & 1 deletion
@@ -4,4 +4,4 @@
 <li><a {% if page.id == 'blog' %}class="active"{% endif %} href="/blog/">博客</a></li>
 <li><a {% if page.id == 'support' %}class="active"{% endif %} href="/support/">支持</a></li>
 <li><a {% if page.id == 'apps' %}class="active"{% endif %} href="http://bbs.pytorch.cn">论坛</a></li>
-</ul>
+/ul>

_layouts/post.html

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@
       integrity="sha384-BVYiiSIFeK1dGmJRAkycuHAHRg32OmUcww7on3RYdg4Va+PmSTsz/K68vbdEjh4u"
       crossorigin="anonymous">
   <link rel="stylesheet" href="/static/css/main.css">
+  <link rel="stylesheet" href="/static/css/jekyll-github.css">
 </head>
 <body id="{{ page.id }}">
 

docs/tensors.html renamed to _layouts/redirect.html

Lines changed: 3 additions & 3 deletions
@@ -2,14 +2,14 @@
 <html lang="en-US">
 <head>
 <meta charset="UTF-8">
-<meta http-equiv="refresh" content="1; url=master/tensors.html">
+<meta http-equiv="refresh" content="1; url={{ page.redirect_url }}">
 <script type="text/javascript">
-  window.location.href = "master/tensors.html"
+  window.location.href = "{{ page.redirect_url }}"
 </script>
 <title>Page Redirection</title>
 </head>
 <body>
-If you are not redirected automatically, follow this <a href='master/tensors.html'>link to the latest documentation</a>.
+If you are not redirected automatically, follow this <a href='{{ page.redirect_url }}'>link to the latest documentation</a>.
 <br />
 If you want to view documentation for a particular version, follow this <a href='versions.html'>link</a>.
 </body>

_posts/2017-5-11-Internals.md

Lines changed: 29 additions & 24 deletions
@@ -56,7 +56,7 @@ typedef struct {
 The `PyObject_HEAD` is a macro that brings in the code that implements an object's reference counting, and a pointer to the corresponding type object. So in this case, to implement a float, the only other "state" needed is the floating point value itself.
 
 Now, let's see the struct for our `THPTensor` type:
-```
+```cpp
 struct THPTensor {
     PyObject_HEAD
     THTensor *cdata;
@@ -65,7 +65,7 @@ struct THPTensor {
 Pretty simple, right? We are just wrapping the underlying `TH` tensor by storing a pointer to it.
 
 The key part is defining the "type object" for a new type. An example definition of a type object for our Python float takes the form:
-```
+```cpp
 static PyTypeObject py_FloatType = {
     PyVarObject_HEAD_INIT(NULL, 0)
     "py.FloatObject",          /* tp_name */
@@ -97,16 +97,16 @@ The type object for our `THPTensor` is `THPTensorType`, defined in `csrc/generic
 
 As an example, let's take a look at the `tp_new` function we set in the `PyTypeObject`:
 
-```
+```cpp
 PyTypeObject THPTensorType = {
   PyVarObject_HEAD_INIT(NULL, 0)
   ...
   THPTensor_(pynew),         /* tp_new */
 };
 ```
-The `tp_new` function enables object creation. It is responsible for creating (as opposed to initializing) objects of that type and is equivalent to the `__new()__` method at the Python level. The C implementation is a static method that is passed the type being instantiated and any arguments, and returns a newly created object.
+The `tp_new` function enables object creation. It is responsible for creating (as opposed to initializing) objects of that type and is equivalent to the `__new__()` method at the Python level. The C implementation is a static method that is passed the type being instantiated and any arguments, and returns a newly created object.
 
-```
+```cpp
 static PyObject * THPTensor_(pynew)(PyTypeObject *type, PyObject *args, PyObject *kwargs)
 {
   HANDLE_TH_ERRORS
@@ -117,7 +117,7 @@ static PyObject * THPTensor_(pynew)(PyTypeObject *type, PyObject *args, PyObject
 ```
 The first thing our new function does is allocate the `THPTensor`. It then runs through a series of initializations based off of the args passed to the function. For example, when creating a `THPTensor` *x* from another `THPTensor` *y*, we set the newly created `THPTensor`'s `cdata` field to be the result of calling `THTensor_(newWithTensor)` with the *y*'s underlying `TH` Tensor as an argument. Similar constructors exist for sizes, storages, NumPy arrays, and sequences.
 
-** Note that we solely use `tp_new`, and not a combination of `tp_new` and `tp_init` (which corresponds to the `__init()__` function).
+** Note that we solely use `tp_new`, and not a combination of `tp_new` and `tp_init` (which corresponds to the `__init__()` function).
 
 The other important thing defined in Tensor.cpp is how indexing works. PyTorch Tensors support Python's **Mapping Protocol**. This allows us to do things like:
 ```python
@@ -136,7 +136,7 @@ The most important methods are `THPTensor_(getValue)` and `THPTensor_(setValue)`
 ---
 
 We could spend a ton of time exploring various aspects of the `THPTensor` and how it relates to defining a new Python object. But we still need to see how the `THPTensor_(init)()` function is translated to the `THPIntTensor_init()` we used in our module initialization. How do we take our `Tensor.cpp` file that defines a "generic" Tensor and use it to generate Python objects for all the permutations of types? To put it another way, `Tensor.cpp` is littered with lines of code like:
-```
+```cpp
 return THPTensor_(New)(THTensor_(new)(LIBRARY_STATE_NOARGS));
 ```
 This illustrates both cases we need to make type-specific:
@@ -149,15 +149,15 @@ In other words, for all supported Tensor types, we need to "generate" source cod
 One component building an Extension module using Setuptools is to list the source files involved in the compilation. However, our `csrc/generic/Tensor.cpp` file is not listed! So how does the code in this file end up being a part of the end product?
 
 Recall that we are calling the `THPTensor*` functions (such as `init`) from the directory above `generic`. If we take a look in this directory, there is another file `Tensor.cpp` defined. The last line of this file is important:
-```
+```cpp
 //generic_include TH torch/csrc/generic/Tensor.cpp
 ```
 Note that this `Tensor.cpp` file is included in `setup.py`, but it is wrapped in a call to a Python helper function called `split_types`. This function takes as input a file, and looks for the "//generic_include" string in the file contents. If it is found, it generates a new output file for each Tensor type, with the following changes:
 
 - The output file is renamed to `Tensor<Type>.cpp`
 - The output file is slightly modified as follows:
 
-```
+```cpp
 # Before:
 //generic_include TH torch/csrc/generic/Tensor.cpp
 
@@ -166,7 +166,8 @@ Note that this `Tensor.cpp` file is included in `setup.py`, but it is wrapped in
 #include "TH/THGenerate<Type>Type.h"
 ```
 Including the header file on the second line has the side effect of including the source code in `Tensor.cpp` with some additional context defined. Let's take a look at one of the headers:
-```
+
+```cpp
 #ifndef TH_GENERIC_FILE
 #error "You must define TH_GENERIC_FILE before including THGenerateFloatType.h"
 #endif
@@ -192,21 +193,22 @@ Including the header file on the second line has the side effect of including th
 #undef TH_GENERIC_FILE
 #endif
 ```
+
 What this is doing is bringing in the code from the generic `Tensor.cpp` file and surrounding it with the following macro definitions. For example, we define real as a float, so any code in the generic Tensor implementation that refers to something as a real will have that real replaced with a float. In the corresponding file `THGenerateIntType.h`, the same macro would replace `real` with `int`.
 
 These output files are returned from `split_types` and added to the list of source files, so we can see how the `.cpp` code for different types is created.
 
-There are a few things to note here: First, the `split_types` function is not strictly necessary. We could wrap the code in `Tensor.cpp` in a single file, repeating it for each type. The reason we split the code into separate files is to speed up compilation. Second, what we mean when we talk about the type replacement (e.g. replace real with a float) is that the C preprocessor will perform these subsitutions during compilaiton. Merely surrounding the source code with these macros has no side effects until preprocessing.
+There are a few things to note here: First, the `split_types` function is not strictly necessary. We could wrap the code in `Tensor.cpp` in a single file, repeating it for each type. The reason we split the code into separate files is to speed up compilation. Second, what we mean when we talk about the type replacement (e.g. replace real with a float) is that the C preprocessor will perform these substitutions during compilation. Merely surrounding the source code with these macros has no side effects until preprocessing.
 
 ### Generic Builds (Part Two)
 ---
 
 Now that we have source files for all the Tensor types, we need to consider how the corresponding header declarations are created, and also how the conversions from `THTensor_(method)` and `THPTensor_(method)` to `TH<Type>Tensor_method` and `THP<Type>Tensor_method` work. For example, `csrc/generic/Tensor.h` has declarations like:
-```
+```cpp
 THP_API PyObject * THPTensor_(New)(THTensor *ptr);
 ```
 We use the same strategy for generating code in the source files for the headers. In `csrc/Tensor.h`, we do the following:
-```
+```cpp
 #include "generic/Tensor.h"
 #include <TH/THGenerateAllTypes.h>
 
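(Editorial aside, not part of this commit: the preprocessor pattern the hunks above describe can be reproduced in a few self-contained lines. Everything below — `Tensor_`, `MyFloatAdd`, `MyIntAdd` — is a hypothetical sketch of the technique, not PyTorch source.)

```cpp
#include <stdio.h>

/* Two-step concatenation so macro arguments (e.g. Real) are expanded
 * before pasting -- the same trick TH_CONCAT_4 relies on. */
#define CONCAT3_EXPAND(a, b, c) a##b##c
#define CONCAT3(a, b, c) CONCAT3_EXPAND(a, b, c)

/* "Generic" naming macro: each use is specialized by whatever
 * `real`/`Real` are #define'd to at the point of expansion. */
#define Tensor_(NAME) CONCAT3(My, Real, NAME)

#define real float
#define Real Float
static real Tensor_(Add)(real a, real b) { return a + b; }  /* -> MyFloatAdd */
#undef real
#undef Real

#define real int
#define Real Int
static real Tensor_(Add)(real a, real b) { return a + b; }  /* -> MyIntAdd */
#undef real
#undef Real

int main(void) {
    printf("%f %d\n", (double)MyFloatAdd(1.5f, 2.25f), MyIntAdd(2, 3));
    return 0;
}
```

As the post notes, nothing happens until the preprocessor runs; the `#define`/`#undef` pairs merely scope the substitution, and in the real `TH` library the generic body lives in a separate file pulled in by `#include` rather than being repeated inline.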
@@ -216,26 +218,28 @@ We use the same strategy for generating code in the source files for the headers
 This has the same effect, where we draw in the code from the generic header, wrapped with the same macro definitions, for each type. The only difference is that the resulting code is contained all within the same header file, as opposed to being split into multiple source files.
 
 Lastly, we need to consider how we "convert" or "substitute" the function types. If we look in the same header file, we see a bunch of `#define` statements, including:
-```
+```cpp
 #define THPTensor_(NAME) TH_CONCAT_4(THP,Real,Tensor_,NAME)
 ```
 This macro says that any string in the source code matching the format `THPTensor_(NAME)` should be replaced with `THPRealTensor_NAME`, where Real is derived from whatever the symbol Real is `#define`'d to be at the time. Because our header code and source code is surrounded by macro definitions for all the types as seen above, after the preprocessor has run, the resulting code is what we would expect. The code in the `TH` library defines the same macro for `THTensor_(NAME)`, supporting the translation of those functions as well. In this way, we end up with header and source files with specialized code.
-####Module Objects and Type Methods
+
+#### Module Objects and Type Methods
+
 Now we have seen how we have wrapped `TH`'s Tensor definition in `THP`, and generated THP methods such as `THPFloatTensor_init(...)`. Now we can explore what the above code actually does in terms of the module we are creating. The key line in `THPTensor_(init)` is:
-```
+```cpp
 # THPTensorBaseStr, THPTensorType are also macros that are specific
 # to each type
 PyModule_AddObject(module, THPTensorBaseStr, (PyObject *)&THPTensorType);
 ```
 This function registers our Tensor objects to the extension module, so we can use THPFloatTensor, THPIntTensor, etc. in our Python code.
 
 Just being able to create Tensors isn't very useful - we need to be able to call all the methods that `TH` defines. A simple example shows calling the in-place `zero_` method on a Tensor.
-```
+```python
 x = torch.FloatTensor(10)
 x.zero_()
 ```
 Let's start by seeing how we add methods to newly defined types. One of the fields in the "type object" is `tp_methods`. This field holds an array of method definitions (`PyMethodDef`s) and is used to associate methods (and their underlying C/C++ implementations) with a type. Suppose we wanted to define a new method on our `PyFloatObject` that replaces the value. We could implement this as follows:
-```
+```cpp
 static PyObject * replace(PyFloatObject *self, PyObject *args) {
     double val;
     if (!PyArg_ParseTuple(args, "d", &val))
@@ -245,12 +249,12 @@ static PyObject * replace(PyFloatObject *self, PyObject *args) {
 }
 ```
 This is equivalent to the Python method:
-```
+```python
 def replace(self, val):
-    self.ob_fval = fal
+    self.ob_fval = val
 ```
 It is instructive to read more about how defining methods works in CPython. In general, methods take as the first parameter the instance of the object, and optionally parameters for the positional arguments and keyword arguments. This static function is registered as a method on our float:
-```
+```cpp
 static PyMethodDef float_methods[] = {
     {"replace", (PyCFunction)replace, METH_VARARGS,
      "replace the value in the float"
@@ -261,9 +265,10 @@ static PyMethodDef float_methods[] = {
 This registers a method called replace, which is implemented by the C function of the same name. The `METH_VARARGS` flag indicates that the method takes a tuple of arguments representing all the arguments to the function. This array is set to the `tp_methods` field of the type object, and then we can use the `replace` method on objects of that type.
 
 We would like to be able to call all of the methods for `TH` tensors on our `THP` tensor equivalents. However, writing wrappers for all of the `TH` methods would be time-consuming and error prone. We need a better way to do this.
+
 ### PyTorch cwrap
 ---
-PyTorch implements its own cwrap tool to wrap the `TH` Tensor methods for use in the Python backend. We define a `.cwrap` file containing a series of C method declarations in our custom YAML format (http://yaml.org). The cwrap tool takes this file and outputs `.cpp` source files containing the wrapped methods in a format that is compatible with our `THPTensor` Python object and the Python C extension method calling format. This tool is used to generate code to wrap not only `TH`, but also `CuDNN`. It is defined to be extensible.
+PyTorch implements its own cwrap tool to wrap the `TH` Tensor methods for use in the Python backend. We define a `.cwrap` file containing a series of C method declarations in our custom [YAML format](http://yaml.org). The cwrap tool takes this file and outputs `.cpp` source files containing the wrapped methods in a format that is compatible with our `THPTensor` Python object and the Python C extension method calling format. This tool is used to generate code to wrap not only `TH`, but also `CuDNN`. It is defined to be extensible.
 
 An example YAML "declaration" for the in-place `addmv_` function is as follows:
 ```
@@ -284,7 +289,7 @@ An example YAML "declaration" for the in-place `addmv_` function is as follows:
 ```
 The architecture of the cwrap tool is very simple. It reads in a file, and then processes it with a series of **plugins.** See `tools/cwrap/plugins/__init__.py` for documentation on all the ways a plugin can alter the code.
 
-The source code generation occurs in a series of passes. First, the YAML "declaration" is parsed and processed. Then the source code is generated piece-by-piece - adding things like argument checks and extractions, defining the method header, and the actual call to the underlying library such as `TH`. Finally, the cwrap tool allows for processing the entire file at a time. The resulting output for `addmv_` can be explored here: https://gist.github.com/killeent/c00de46c2a896335a52552604cc4d74b.
+The source code generation occurs in a series of passes. First, the YAML "declaration" is parsed and processed. Then the source code is generated piece-by-piece - adding things like argument checks and extractions, defining the method header, and the actual call to the underlying library such as `TH`. Finally, the cwrap tool allows for processing the entire file at a time. The resulting output for `addmv_` can be [explored here](https://gist.github.com/killeent/c00de46c2a896335a52552604cc4d74b).
 
 In order to interface with the CPython backend, the tool generates an array of `PyMethodDef`s that can be stored or appended to the `THPTensor`'s `tp_methods` field.
 
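(Editorial aside, not part of this commit: the shape of such a generated array, including the sentinel entry the hunks above elide, looks roughly like this — `float_methods` and `replace` echo the post's toy example and are hypothetical names.)

```cpp
/* Hypothetical sketch of a PyMethodDef array like the ones cwrap emits;
 * `replace` is the toy C function from the post's example above. */
static PyMethodDef float_methods[] = {
    {"replace", (PyCFunction)replace, METH_VARARGS,
     "replace the value in the float"},
    {NULL, NULL, 0, NULL}  /* zero-filled sentinel terminates the array */
};
```

Assigning this array to the type object's `tp_methods` field before `PyType_Ready` runs is all it takes for `replace` to become callable from Python.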
@@ -322,4 +327,4 @@ This is just a snapshot of parts of the build system for PyTorch. There is more
 ### Resources:
 ---
 
-- https://docs.python.org/3.7/extending/index.html is invaluable for understanding how to write C/C++ Extension to Python
+- <https://docs.python.org/3.7/extending/index.html> is invaluable for understanding how to write C/C++ Extension to Python
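(Editorial aside, not part of this commit or of PyTorch: to tie together the mechanics this post's diff touches — `tp_new`, `tp_methods`, and `PyModule_AddObject` — here is a minimal, compilable sketch of a CPython extension type. All names such as `mymod` and `MyFloat` are hypothetical.)

```cpp
#include <Python.h>

typedef struct {
    PyObject_HEAD
    double ob_fval;            /* the only extra "state", as in the post */
} MyFloatObject;

/* tp_new: creates (as opposed to initializes) the object, in the same
 * role THPTensor_(pynew) plays for tensors. */
static PyObject *MyFloat_new(PyTypeObject *type, PyObject *args, PyObject *kwargs) {
    MyFloatObject *self = (MyFloatObject *)type->tp_alloc(type, 0);
    if (self != NULL)
        self->ob_fval = 0.0;
    return (PyObject *)self;
}

static PyObject *MyFloat_replace(MyFloatObject *self, PyObject *args) {
    double val;
    if (!PyArg_ParseTuple(args, "d", &val))
        return NULL;
    self->ob_fval = val;       /* overwrite the stored value in place */
    Py_RETURN_NONE;
}

static PyMethodDef MyFloat_methods[] = {
    {"replace", (PyCFunction)MyFloat_replace, METH_VARARGS,
     "replace the value in the float"},
    {NULL, NULL, 0, NULL}      /* sentinel */
};

static PyTypeObject MyFloatType = {
    PyVarObject_HEAD_INIT(NULL, 0)
    "mymod.MyFloat",           /* tp_name */
    sizeof(MyFloatObject),     /* tp_basicsize */
};

static struct PyModuleDef mymod_def = {
    PyModuleDef_HEAD_INIT, "mymod", NULL, -1, NULL
};

PyMODINIT_FUNC PyInit_mymod(void) {
    MyFloatType.tp_flags = Py_TPFLAGS_DEFAULT;
    MyFloatType.tp_methods = MyFloat_methods;
    MyFloatType.tp_new = MyFloat_new;
    if (PyType_Ready(&MyFloatType) < 0)
        return NULL;
    PyObject *m = PyModule_Create(&mymod_def);
    if (m == NULL)
        return NULL;
    Py_INCREF(&MyFloatType);
    /* register the type on the module, as PyModule_AddObject does in THPTensor_(init) */
    PyModule_AddObject(m, "MyFloat", (PyObject *)&MyFloatType);
    return m;
}
```

Built with a one-line setup script (`Extension('mymod', ['mymod.c'])`), this supports `import mymod; f = mymod.MyFloat(); f.replace(3.0)` — the same create-then-call path the post walks through for `THPFloatTensor`.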
