Commit fdf18d9

Restructure README file
1 parent 8291aaa commit fdf18d9

8 files changed

+323
-331
lines changed

README.md

+6-331
Large diffs are not rendered by default.
File renamed without changes.

docs/introduction_to_docker.md

+23
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
### Introduction to Docker.
2+
3+
- Building the Docker image for Worker using python:3.10.2-alpine3.15
4+
5+
* ***Docker***
6+
7+
Docker is a container management service built around the idea of develop, ship, and run anywhere: developers build applications, package them into containers, and deploy those containers anywhere.
8+
* ***Images***
9+
10+
Docker images are read-only templates with instructions for creating a Docker container. An image can be pulled from Docker Hub and used as is, or you can add instructions on top of a base image to create a new, modified image. You can also create your own images with a Dockerfile: write a Dockerfile containing all the instructions needed to build the image, then build it to produce your custom Docker image.
11+
* ***To create a Docker image from python:3.10.2-alpine3.15***
12+
13+
1) Create a Dockerfile
14+
15+
FROM python:3.10.2-alpine3.15
16+
# Create directories
17+
RUN mkdir -p /root/workspace/src
18+
# Switch to project directory
19+
WORKDIR /root/workspace/src
20+
21+
2) Go to the directory where you created the Dockerfile and build the image (image names must be lowercase)
22+
23+
docker build ./ -t simple_python

docs/introduction_to_git_commands.md

+49
@@ -0,0 +1,49 @@
1+
### Introduction to GitHub.
2+
- **Setting up GitHub**.
3+
4+
Make a repository in GitHub
5+
6+
Go to GitHub.com and log in.
7+
Click the green “New Repository” button
8+
Repository name: myrepo
9+
Public
10+
Check “Initialize this repository with a README”
11+
Click the green “Create repository” button
12+
Copy the HTTPS clone URL to your clipboard via the green “Clone or Download” button.
13+
Clone the repository to your computer
14+
15+
git clone https://github.com/YOUR-USERNAME/YOUR-REPOSITORY.git
16+
17+
Make a local change, commit, and push
18+
19+
git add <file path>  # stage the file you modified so it is ready to commit
20+
git commit -m "A commit from my local computer"  # commit the staged changes to the local repository
21+
git push origin <branchname>  # push the commits to the named branch of the remote repository
22+
23+
## General git flow:
24+
![git flow](gitflow.png)
25+
26+
- **Basic commands of git**
27+
28+
* **git init**
29+
30+
The command git init is used to create an empty Git repository.
31+
32+
* **git add**
33+
34+
The add command is used after checking the status of the files, to add those files to the staging area.
35+
Before running the commit command, "git add" is used to add any new or modified files.
36+
37+
* **git commit**
38+
39+
The commit command makes sure that the changes are saved to the local repository.
40+
The command "git commit -m <message>" lets you describe the change so that everyone can understand what has happened.
41+
* **git status**
42+
43+
The git status command tells the current state of the repository.
44+
45+
The command shows the current working branch. Files that are in the staging area but not yet committed are listed in its output. If there are no changes, it prints the message "nothing to commit, working tree clean".
46+
* **git config**
47+
48+
The git config command is used initially to configure user.name and user.email. These values specify which username and email address are recorded on commits made from the local repository.
49+

docs/introduction_to_postgresql.md

+80
@@ -0,0 +1,80 @@
1+
### Introduction to PostgreSQL.
2+
- **Key Features of PostgreSQL**.
3+
- Free to download
4+
- Supports data integrity
5+
- Supports multiple data types
6+
- Highly extensible
7+
- Secure
8+
- Highly reliable
9+
10+
- **JOINS**.
11+
- The CROSS JOIN
12+
- The INNER JOIN
13+
- The LEFT OUTER JOIN
14+
- The RIGHT OUTER JOIN
15+
- The FULL OUTER JOIN
16+
![Pictorial Representation of JOINS](https://i.stack.imgur.com/4zjxm.png)
17+
18+
**The CROSS JOIN**
19+
20+
A CROSS JOIN matches every row of the first table with every row of the second table, so tables with x and y rows produce a result with x*y rows.
21+
SELECT ... FROM table1 CROSS JOIN table2 …
22+
SELECT EMP_ID, NAME, DEPT FROM COMPANY CROSS JOIN DEPARTMENT;
23+
24+
**The INNER JOIN**
25+
26+
An INNER JOIN creates a new result table by combining column values of two tables (table1 and table2) based on the join predicate. The query compares each row of table1 with each row of table2 to find all pairs of rows that satisfy the join predicate. When the join predicate is satisfied, the column values of each matched pair of rows from table1 and table2 are combined into a result row.
27+
SELECT table1.column1, table2.column2...
28+
FROM table1
29+
INNER JOIN table2
30+
ON table1.common_field = table2.common_field;
31+
32+
SELECT EMP_ID, NAME, DEPT FROM COMPANY INNER JOIN DEPARTMENT ON COMPANY.ID = DEPARTMENT.EMP_ID;
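As a rough illustration of the CROSS JOIN and INNER JOIN queries above, the following sketch runs them with Python's built-in sqlite3 module, used here as a stand-in for PostgreSQL so that no database server is needed (an assumption, not part of the workshop setup). The column layout of COMPANY and DEPARTMENT and the sample rows are invented for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COMPANY (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE DEPARTMENT (DEPT TEXT, EMP_ID INTEGER);
    -- Hypothetical sample rows: Teddy has no department, Finance has no employee.
    INSERT INTO COMPANY VALUES (1, 'Paul'), (2, 'Allen'), (3, 'Teddy');
    INSERT INTO DEPARTMENT VALUES ('IT', 1), ('Engineering', 2), ('Finance', 7);
""")

# CROSS JOIN: every COMPANY row paired with every DEPARTMENT row (3 * 3 rows).
cross_rows = conn.execute(
    "SELECT EMP_ID, NAME, DEPT FROM COMPANY CROSS JOIN DEPARTMENT"
).fetchall()

# INNER JOIN: only the pairs that satisfy the join predicate.
inner_rows = conn.execute(
    "SELECT EMP_ID, NAME, DEPT FROM COMPANY "
    "INNER JOIN DEPARTMENT ON COMPANY.ID = DEPARTMENT.EMP_ID "
    "ORDER BY EMP_ID"
).fetchall()

print(len(cross_rows))
print(inner_rows)
```

With these sample rows the cross join yields nine rows, while the inner join keeps only the two employees whose ID appears in DEPARTMENT.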
33+
34+
**The LEFT OUTER JOIN**
35+
36+
The OUTER JOIN is an extension of the INNER JOIN. The SQL standard defines three types of OUTER JOIN: LEFT, RIGHT, and FULL, and PostgreSQL supports all of them.
37+
In a LEFT OUTER JOIN, an inner join is performed first. Then, for each row in table T1 that does not satisfy the join condition with any row in table T2, a joined row is added with null values in the columns of T2. Thus, the joined table always has at least one row for each row in T1.
38+
39+
SELECT ... FROM table1 LEFT OUTER JOIN table2 ON conditional_expression ...
40+
41+
SELECT EMP_ID, NAME, DEPT FROM COMPANY LEFT OUTER JOIN DEPARTMENT
42+
ON COMPANY.ID = DEPARTMENT.EMP_ID;
43+
44+
**The RIGHT OUTER JOIN**
45+
46+
First, an inner join is performed. Then, for each row in table T2 that does not satisfy the join condition with any row in table T1, a joined row is added with null values in columns of T1. This is the converse of a left join; the result table will always have a row for each row in T2.
47+
SELECT ... FROM table1 RIGHT OUTER JOIN table2 ON conditional_expression ...
48+
49+
SELECT EMP_ID, NAME, DEPT FROM COMPANY RIGHT OUTER JOIN DEPARTMENT ON COMPANY.ID = DEPARTMENT.EMP_ID;
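The null-filling behaviour of the outer joins can be sketched in the same way. This example again uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption), with invented sample rows. Because SQLite releases before 3.39 do not support RIGHT OUTER JOIN, the right join is written here as the equivalent LEFT OUTER JOIN with the two tables swapped; in PostgreSQL the RIGHT OUTER JOIN query above can be used directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE COMPANY (ID INTEGER PRIMARY KEY, NAME TEXT);
    CREATE TABLE DEPARTMENT (DEPT TEXT, EMP_ID INTEGER);
    -- Hypothetical sample rows: Teddy has no department, Finance has no employee.
    INSERT INTO COMPANY VALUES (1, 'Paul'), (2, 'Allen'), (3, 'Teddy');
    INSERT INTO DEPARTMENT VALUES ('IT', 1), ('Engineering', 2), ('Finance', 7);
""")

# LEFT OUTER JOIN: every COMPANY row is kept; unmatched rows get NULL
# (None in Python) in the DEPARTMENT columns.
left_rows = conn.execute(
    "SELECT EMP_ID, NAME, DEPT FROM COMPANY "
    "LEFT OUTER JOIN DEPARTMENT ON COMPANY.ID = DEPARTMENT.EMP_ID "
    "ORDER BY COMPANY.ID"
).fetchall()

# Equivalent of COMPANY RIGHT OUTER JOIN DEPARTMENT: every DEPARTMENT row
# is kept; unmatched rows get NULL in the COMPANY columns.
right_rows = conn.execute(
    "SELECT EMP_ID, NAME, DEPT FROM DEPARTMENT "
    "LEFT OUTER JOIN COMPANY ON COMPANY.ID = DEPARTMENT.EMP_ID "
    "ORDER BY EMP_ID"
).fetchall()

print(left_rows)
print(right_rows)
```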
50+
51+
**The FULL OUTER JOIN**
52+
53+
First, an inner join is performed. Then, for each row in table T1 that does not satisfy the join condition with any row in table T2, a joined row is added with null values in columns of T2. In addition, for each row of T2 that does not satisfy the join condition with any row in T1, a joined row with null values in the columns of T1 is added.
54+
The following is the syntax of FULL OUTER JOIN −
55+
SELECT ... FROM table1 FULL OUTER JOIN table2 ON conditional_expression ...
56+
Using the COMPANY and DEPARTMENT tables, we can write a full outer join as follows −
57+
SELECT EMP_ID, NAME, DEPT FROM COMPANY FULL OUTER JOIN DEPARTMENT ON COMPANY.ID = DEPARTMENT.EMP_ID;
58+
- **Things to Note**
59+
60+
- SELECT supports LIMIT.
61+
- You can use GROUP BY, ORDER BY, HAVING clauses, etc.
62+
- You cannot apply LIMIT directly to DELETE or UPDATE. Use a subquery instead:
63+
> delete from student where sid in (select sid from student limit 10);
64+
65+
> update student set city = 'mangalore' where sid in (select sid from student limit 10);
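The subquery approach to a limited UPDATE or DELETE can be tried out with the sketch below. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption), and the student table layout, the row count, and the starting city value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (sid INTEGER PRIMARY KEY, city TEXT)")
# Thirty hypothetical students, all starting in the same city.
conn.executemany(
    "INSERT INTO student (sid, city) VALUES (?, ?)",
    [(sid, "udupi") for sid in range(1, 31)],
)

# Update only 10 rows: the subquery picks the ids, the outer statement updates them.
conn.execute(
    "UPDATE student SET city = 'mangalore' "
    "WHERE sid IN (SELECT sid FROM student ORDER BY sid LIMIT 10)"
)
updated = conn.execute(
    "SELECT COUNT(*) FROM student WHERE city = 'mangalore'"
).fetchone()[0]

# Delete only 10 rows in the same way.
conn.execute(
    "DELETE FROM student "
    "WHERE sid IN (SELECT sid FROM student ORDER BY sid LIMIT 10)"
)
remaining = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(updated, remaining)
```

Adding an ORDER BY inside the subquery makes it predictable which rows are affected; without it the database is free to pick any 10 rows.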
66+
67+
- **Update the existing docker image to support PostgreSQL**
68+
69+
FROM python:3.10.2-alpine3.15
70+
RUN apk update
71+
RUN apk add postgresql
72+
RUN chown postgres:postgres /run/postgresql/
73+
# Create directories
74+
RUN mkdir -p /root/workspace/src
75+
COPY ./web_scraping_sample.py /root/workspace/src
76+
# Switch to project directory
77+
WORKDIR /root/workspace/src
78+
79+
Go to the directory where you created the Dockerfile
80+
docker build -t simple_python .

docs/introduction_to_webscraping.md

+127
@@ -0,0 +1,127 @@
1+
### Introduction to Webscraping.
2+
- **Beautiful Soup**
3+
- *Introduction*
4+
Beautiful Soup is a Python package that allows us to pull data out of HTML and XML documents.
5+
- *Beautiful Soup - Installation*
6+
7+
pip install beautifulsoup4
8+
- *Import beautifulsoup*
9+
10+
from bs4 import BeautifulSoup
11+
- *Important Methods*
12+
13+
1) **find**(name, attrs, recursive, string, **kwargs)
14+
15+
Scans the document and returns only the first matching element, or None if nothing matches.
16+
17+
2) **find_all**(name, attrs, recursive, string, limit, **kwargs)
18+
19+
You can use find_all to extract all the occurrences of a particular tag from the page response as a list-like result set.
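A minimal sketch of find and find_all follows. The HTML snippet is invented for illustration, and the built-in "html.parser" is used so that no extra parser package is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML document used only for this illustration.
html = """
<html><body>
  <h1 id="title">Fruit</h1>
  <ul>
    <li class="item">Apple</li>
    <li class="item">Banana</li>
    <li>Cherry</li>
  </ul>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

first_item = soup.find("li")                             # first <li> only
all_items = soup.find_all("li")                          # every <li>
classed = soup.find_all("li", attrs={"class": "item"})   # filter by attribute
limited = soup.find_all("li", limit=2)                   # stop after two matches
missing = soup.find("table")                             # no match gives None

print(first_item.text)
print([li.text for li in all_items])
print(len(classed), len(limited), missing)
```

Because find returns None when nothing matches, it is worth checking the result before accessing attributes such as .text.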
20+
21+
- **Regex**
22+
23+
***Introduction***
24+
25+
The Python module re provides full support for Perl-like regular expressions in Python.
26+
27+
***Important methods***
28+
29+
* **re.match**(pattern, string, flags=0)
30+
31+
The re.match function tries to match the pattern at the beginning of the string. It returns a match object on success and None on failure. We use the group(num) or groups() method of the match object to get the matched expression.
32+
33+
* **re.search**(pattern, string, flags=0)
34+
35+
The search() function scans the whole string for the first match and returns a Match object if there is one, otherwise None.
36+
37+
* **re.findall**(pattern, string, flags=0)
38+
39+
The findall() function returns a list containing all non-overlapping matches.
40+
41+
* **re.sub**(pattern, replace_string, string)
42+
43+
The sub() function replaces the matches with the text of your choice.
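The four functions above can be compared on one input string. This is a small sketch; the sample text and patterns are made up for illustration.

```python
import re

text = "Order 66 shipped on 2022-03-15, order 7 pending"

# re.match only succeeds if the pattern matches at the start of the string.
start = re.match(r"Order (\d+)", text)
order_number = start.group(1)
not_at_start = re.match(r"shipped", text)   # None: "shipped" is not at position 0

# re.search scans the whole string and returns the first match.
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)
date_parts = date.groups()

# re.findall returns every non-overlapping match as a list of strings.
numbers = re.findall(r"\d+", text)

# re.sub replaces every match with the replacement text.
masked = re.sub(r"\d", "#", text)

print(order_number, not_at_start, date_parts)
print(numbers)
print(masked)
```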
44+
45+
***Metacharacters***
46+
47+
[] a set of characters
48+
. any character except a newline
49+
^ starts with
50+
$ ends with
51+
* zero or more occurrences
52+
+ one or more occurrences
53+
? zero or one occurrence
54+
{} exactly the specified number of occurrences
55+
() captures a group
56+
*Important Special Sequences*
57+
\w Matches word characters.
58+
\W Matches nonword characters.
59+
\s Matches whitespace. Equivalent to [ \t\n\r\f\v] for ASCII text.
60+
\S Matches non-whitespace.
61+
\d Matches digits. Equivalent to [0-9] for ASCII text.
62+
\D Matches non-digits.
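The metacharacters and special sequences listed above can be checked one at a time with re.findall and re.search. The sample strings below are invented for illustration.

```python
import re

vowels = re.findall(r"[aeiou]", "regex")                 # [] set of characters
any_char = re.findall(r"c.t", "cat cot cut ct")          # .  any single character
starts = bool(re.search(r"^Hello", "Hello world"))       # ^  starts with
ends = bool(re.search(r"world$", "Hello world"))         # $  ends with
star = re.findall(r"ab*", "a ab abbb")                   # *  zero or more b
plus = re.findall(r"ab+", "a ab abbb")                   # +  one or more b
optional = re.findall(r"colou?r", "color colour")        # ?  zero or one u
exact = re.findall(r"\d{3}", "12 345 6789")              # {} exactly three digits
groups = re.search(r"(\w+)@(\w+)", "mail bob@example now").groups()  # () groups
spaces = re.findall(r"\s", "a b\tc")                     # \s whitespace
non_digits = re.findall(r"\D+", "ab12cd")                # \D runs of non-digits

print(vowels, any_char, starts, ends)
print(star, plus, optional, exact)
print(groups, spaces, non_digits)
```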
63+
- **urllib/requests**
64+
65+
***Requests***
66+
67+
Introduction
68+
69+
The requests module allows you to send HTTP requests using Python.
70+
71+
Important Methods
72+
73+
1) get(url, params, args)
74+
Sends a GET request to the specified URL.
75+
2) post(url, data, json, args)
76+
Sends a POST request to the specified URL.
77+
3) delete(url, args)
78+
Sends a DELETE request to the specified URL.
79+
***Urllib***
80+
81+
Introduction
82+
urllib is a Python 3 standard-library package that allows you to access and interact with websites using their URLs (Uniform Resource Locators). It has several modules for working with URLs, including the following:
83+
84+
urllib.request
85+
Using urllib.request, with urlopen, allows you to open the specified URL.
86+
urllib.error
87+
This module defines the exceptions raised by urllib.request, so it is used to catch errors encountered while opening a URL.
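A minimal sketch of urllib.request and urllib.error working together is shown below. To keep it runnable without network access, it opens a data: URL, which urlopen handles locally, and triggers an error with a made-up URL scheme. The fetch helper and both URLs are hypothetical and exist only for this illustration.

```python
import urllib.error
import urllib.request


def fetch(url):
    """Return the response body as text, or None if the URL cannot be opened."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError as exc:
        # HTTPError is a subclass of URLError, so HTTP failures land here too.
        print(f"Could not open {url}: {exc.reason}")
        return None


# A data: URL is decoded locally, so no network connection is required.
body = fetch("data:text/plain,hello%20urllib")

# An unsupported scheme makes urlopen raise URLError, which fetch() catches.
failed = fetch("unknownscheme://example")

print(body, failed)
```

For a real page, the same helper could be called with an http or https URL such as the lipsum.com address used elsewhere in this document.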
88+
89+
- **Writing a script with the above packages and running it in Docker**.
90+
91+
*web_scraping_sample.py*
92+
import re
93+
import requests
94+
from bs4 import BeautifulSoup
95+
res = requests.get('https://www.lipsum.com/')
96+
soup = BeautifulSoup(res.content, 'html5lib')  # If this line causes an error, run 'pip install html5lib'
97+
data = soup.find(re.compile(r'^div$'), attrs={'id': "Panes"})  # the <div id="Panes"> container
98+
print(data.find(string=re.compile("lorem", re.IGNORECASE)))  # first text node that mentions "lorem"
99+
qes_list = []
100+
ans_list = []
101+
for row in data.find_all("div"):
102+
    if row.h2 is None:  # skip blocks that have no heading
103+
        continue
104+
    qes_list.append(row.h2.text)
105+
    # Join the text of every paragraph in this section
106+
    ans_list.append("".join("\n" + p.text for p in row.find_all("p")))
107+
# Print each heading followed by its text and a separator line
108+
tempstring = ""
109+
for question, answer in zip(qes_list, ans_list):
110+
    tempstring += "\n" + question + "\n" + answer + "\n--------------------------------------------------------------------------------------------------\n\n"
111+
print(tempstring)
112+
113+
***Creating a dockerfile in same directory***
114+
115+
FROM python:3.10.2-alpine3.15
116+
# Create directories and install the script's dependencies
117+
RUN mkdir -p /root/workspace/src && pip install requests beautifulsoup4 html5lib
118+
COPY ./web_scraping_sample.py /root/workspace/src
119+
# Switch to project directory
120+
WORKDIR /root/workspace/src
121+
CMD ["python", "web_scraping_sample.py"]
122+
- ***Build Docker image***
123+
124+
docker build -t simple_python .
125+
- ***Run image as a docker container***
126+
127+
docker run -d --name container1 simple_python

docs/webscraping_with_docker.md

+36
@@ -0,0 +1,36 @@
1+
### Web scraping with Docker.
2+
- Create a new Dockerfile.
3+
4+
FROM python:3.10.2-alpine3.15
5+
# Create directories and install the script's dependencies
6+
RUN mkdir -p /root/workspace/src && pip install requests beautifulsoup4 html5lib
7+
COPY ./web_scraping_sample.py /root/workspace/src
8+
# Switch to project directory
9+
WORKDIR /root/workspace/src
10+
11+
- Create a docker-compose file.
12+
13+
version: "3"
14+
services:
15+
  python_service:
16+
    build:
17+
      context: ./
18+
      dockerfile: Dockerfile
19+
    image: workshop1
20+
    container_name: workshop_python_container
21+
    stdin_open: true # docker attach container_id
22+
    tty: true
23+
    ports:
24+
      - "8000:8000"
25+
    volumes:
26+
      - .:/app
27+
- Bring the containers up.
28+
29+
docker-compose up -d
30+
31+
- Log in to the container (use the container_name set in the docker-compose file).
32+
33+
docker exec -it workshop_python_container sh
34+
- Run the script for web scraping inside the container.
35+
36+
python web_scraping_sample.py

docs/workshop1_home_work.md

+2
@@ -0,0 +1,2 @@
1+
### Workshop 1 Home Work.
2+
Submit a PR that scrapes the data from the [Lorem Ipsum - All the facts - Lipsum generator](https://www.lipsum.com/) website and saves each section of that page in the database.
