2024.12.24
huangrt01 committed Dec 24, 2024
1 parent cbbe198 commit 0c94dc7
Showing 8 changed files with 199 additions and 50 deletions.
38 changes: 35 additions & 3 deletions Notes/AI-Algorithms.md
@@ -565,6 +565,12 @@ CLIP, developed by OpenAI, is a model designed to understand and relate images a
- **Imprecise Multimodal Alignment:** The alignment between text and images can be imprecise, especially when dealing with complex or nuanced relationships.
- **Retrieval Performance Variability:** CLIP's performance can vary depending on the specificity of the query and the image, sometimes leading to suboptimal results.

#### CoCa

https://research.google/blog/image-text-pre-training-with-contrastive-captioners/



#### Visualized BGE (BAAI General Embedding)

**How Does Visualized BGE Work?**
@@ -1859,7 +1865,7 @@ response_of_comparation = response.choices[0].message.content return response_of



## ImageSearch
## Multi-modal Search

### Intro

@@ -2099,9 +2105,35 @@ response_of_comparation = response.choices[0].message.content return response_of

### Application

Aliyun
* Aliyun
* https://help.aliyun.com/zh/image-search/developer-reference/api-searchbypic?spm=a2c4g.11186623.help-menu-66413.d_4_3_1_3.7538364fjOQka0&scm=20140722.H_202282._.OR_help-V_1

* Google: https://cloud.google.com/blog/products/ai-machine-learning/multimodal-generative-ai-search
* https://ai-demos.dev/demos/matching-engine
* https://atlas.nomic.ai/map/vertexAI-mercari (visualization)
* ![image-20241221224534885](./AI-Algorithms/image-20241221224534885.png)



### Cases

* E-commerce
  * *"cups with dancing people"*
  * *"handmade accessories with black and white beads"*
  * *"Cups in the Google logo colors"*
  * *"Shirts that says my birthday"*
  * https://help.aliyun.com/zh/image-search/developer-reference/api-searchbypic?spm=a2c4g.11186623.help-menu-66413.d_4_3_1_3.7538364fjOQka0&scm=20140722.H_202282._.OR_help-V_1
* Autonomous driving
  * *"a crossing road with red lights on and pedestrians are standing"*
  * *"a crushed car stopping in the middle of the freeway ahead"*
* Security and surveillance
  * *"a person trying to open the doors"*
  * *"water is flooding in the factory"*
  * *"the machines are on fire"*
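
Free-text queries like these are usually served by embedding the text and the images into one shared vector space and ranking images by their similarity to the query vector. The following is a minimal sketch of that ranking step only, assuming the embeddings already exist. The three-dimensional vectors, the file names, and the `search` helper are all invented for illustration; a real system would take its vectors from a multimodal encoder and would likely use an approximate nearest-neighbor index rather than a linear scan.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, image_index, top_k=2):
    """Return the top_k (image_name, score) pairs, best match first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in image_index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical image embeddings; real ones would come from an image encoder.
IMAGE_INDEX = {
    "cup_dancers.jpg": [0.9, 0.1, 0.0],
    "beaded_bracelet.jpg": [0.1, 0.9, 0.1],
    "birthday_shirt.jpg": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of the text query "cups with dancing people".
QUERY_VEC = [0.8, 0.2, 0.1]

if __name__ == "__main__":
    for name, score in search(QUERY_VEC, IMAGE_INDEX):
        print(f"{name}: {score:.3f}")
```

Cosine similarity is used here because it ignores vector length, so only the direction of the embedding matters.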



Binary file added Notes/AI-Algorithms/image-20241221224534885.png
38 changes: 38 additions & 0 deletions Notes/Machine-Learning.md
@@ -267,6 +267,41 @@ Materials
- We can apply LAMB normalization to any base optimizer
- But the learning rate must be re-tuned
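
The two points above can be made concrete with a rough sketch of the layer-wise trust ratio at the core of LAMB. This is a simplified illustration, not a full optimizer: `base_update` stands for whatever step direction the base optimizer proposes (for example Adam's moment-normalized gradient), and the function name, the argument names, and the fallback to a ratio of 1.0 for zero norms are assumptions made for this example.

```python
import math


def l2_norm(values):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(v * v for v in values))


def lamb_step(weights, base_update, lr, weight_decay=0.0):
    """Apply one LAMB-style step to a single layer's weights.

    The base optimizer's update is rescaled by ||weights|| / ||update||,
    so the step length is tied to the size of the layer's weights.
    """
    update = [u + weight_decay * w for u, w in zip(base_update, weights)]
    w_norm = l2_norm(weights)
    u_norm = l2_norm(update)
    # Assumed fallback: skip the rescaling when either norm is zero.
    trust_ratio = w_norm / u_norm if w_norm > 0.0 and u_norm > 0.0 else 1.0
    return [w - lr * trust_ratio * u for w, u in zip(weights, update)]


if __name__ == "__main__":
    layer = [3.0, 4.0]  # norm 5
    print(lamb_step(layer, [0.6, 0.8], lr=0.1))
    # Same direction, 100x larger magnitude: the resulting step is unchanged.
    print(lamb_step(layer, [60.0, 80.0], lr=0.1))
```

Because the rescaling cancels the magnitude of the base update, the effective step becomes roughly `lr * ||weights||`. That is one way to see why a learning rate tuned for the base optimizer does not carry over.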

#### Activation Functions

* Intro
  * Choosing an activation function: https://machinelearningmastery.com/choose-an-activation-function-for-deep-learning/
  * When using the ReLU function for hidden layers, it is a good practice to use a “*He Normal*” or “*He Uniform*” weight initialization and scale input data to the range 0-1 (normalize) prior to training.
  * Classic problem: the XOR problem
    * ![image-20241221123336418](./Machine-Learning/image-20241221123336418.png)
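
XOR is a standard illustration of why a nonlinear activation is needed: no single linear unit separates its four input points, but one hidden layer with ReLU does. The sketch below uses hand-picked weights, which are an assumption made for illustration (they are not learned, and not taken from the figure), and compares the network with and without the activation.

```python
def relu(x):
    """Rectified linear unit."""
    return max(0.0, x)


def xor_net(x1, x2):
    """Two hidden ReLU units and a linear output, with hand-picked weights."""
    h1 = relu(x1 + x2)        # positive when at least one input is 1
    h2 = relu(x1 + x2 - 1.0)  # positive only when both inputs are 1
    return h1 - 2.0 * h2


def xor_net_without_relu(x1, x2):
    """Same weights with the activation removed: the network collapses to a linear map."""
    h1 = x1 + x2
    h2 = x1 + x2 - 1.0
    return h1 - 2.0 * h2  # simplifies to 2 - x1 - x2


if __name__ == "__main__":
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, xor_net(x1, x2), xor_net_without_relu(x1, x2))
```

With ReLU the outputs match XOR on all four inputs. Without it, the same weights reduce to `2 - x1 - x2`, which is linear and cannot represent XOR.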



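* Sigmoid
  * 1/(1+e^(-x))
  * Well suited as a model's output function when the output should be a probability in the range 0 to 1
  * It has fallen out of favor and is rarely used as a hidden-layer activation in practice
  * Prone to vanishing gradients. The derivative of sigmoid never exceeds 0.25 (its maximum, reached at x = 0), so during backpropagation the product of these per-layer factors shrinks toward 0. Almost no gradient signal reaches the earlier layers, so their weights barely update; this is the vanishing gradient problem. Weight initialization also needs care to avoid saturation: if the initial weights are too large, many neurons start in the saturated region, receive very small gradients, and the network barely learns.
  * The output is not zero-centered, so gradient updates can be pushed in particular directions, which makes weight updates less efficient
  * Computing the exponential is relatively expensive
* Tanh(x) = 2 Sigmoid(2x) − 1
  * Unlike sigmoid, its output is zero-centered
* ReLU
  * Advantages:
    * ReLU mitigates the vanishing gradient problem: for positive inputs the neuron does not saturate
    * Because ReLU is piecewise linear and non-saturating, SGD tends to converge quickly with it
    * Low computational cost: no exponential is needed
  * Disadvantages:
    * The output is not zero-centered
    * Dead ReLU problem: a neuron whose input stays negative outputs 0 and receives zero gradient, so it stops updating. Choosing a suitably small learning rate helps avoid this

* Leaky ReLU: addresses the dying-neuron problem that ReLU has for negative inputs
  * The α in the function has to be set by hand from prior knowledge (typically 0.01)
  * It is close to linear, which can hurt performance on complex classification tasks.
* Parametric ReLU: alpha is learnable
* ELU:
  * ![image-20241221141507094](./Machine-Learning/image-20241221141507094.png)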
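
The activation functions listed above can be written out in a few lines, which also makes the 0.25 bound on the sigmoid derivative and the tanh identity easy to check. This is a scalar sketch for study purposes, not a framework implementation. The ELU form used here is the standard `x` for positive inputs and `alpha * (e^x - 1)` otherwise, and the default `alpha=1.0` is an assumed common choice, not a value stated in the notes.

```python
import math


def sigmoid(x):
    """1 / (1 + e^(-x)), computed so large |x| does not overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_grad(x):
    """Derivative of sigmoid: s * (1 - s), at most 0.25 (reached at x = 0)."""
    s = sigmoid(x)
    return s * (1.0 - s)


def tanh_from_sigmoid(x):
    """Tanh(x) = 2 * Sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


def relu(x):
    return max(0.0, x)


def leaky_relu(x, alpha=0.01):
    """Small fixed slope alpha for negative inputs instead of a flat zero."""
    return x if x > 0 else alpha * x


def elu(x, alpha=1.0):
    """x for positive inputs, alpha * (e^x - 1) otherwise."""
    return x if x > 0 else alpha * math.expm1(x)


if __name__ == "__main__":
    for x in (-2.0, 0.0, 2.0):
        print(x, sigmoid(x), tanh_from_sigmoid(x), relu(x), leaky_relu(x), elu(x))
    # Upper bound on the gradient factor contributed by 10 stacked sigmoid layers.
    print(0.25 ** 10)
```

The last line shows the vanishing-gradient argument numerically: even in the best case, ten sigmoid layers scale the gradient by less than one millionth.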

#### Tuning

https://github.com/google-research/tuning_playbook
@@ -1376,6 +1411,9 @@ NLU: Natural Language Understanding

### TODO

* Hyperparameter tuning
  * https://github.com/google-research/tuning_playbook

* Traditional keyword search
  * https://www.elastic.co/cn/blog/implementing-academic-papers-lessons-learned-from-elasticsearch-and-lucene
* Contrastive learning
1 change: 1 addition & 0 deletions Notes/snippets/profile-linux.sh
@@ -0,0 +1 @@
htop