How to create a clean requirements.txt for setting up a Python environment

Posted on Wed Mar 17 2021 | 2 min | 625 words |

When setting up a Python environment, you sometimes rely on requirements.txt.

However, a requirements.txt generated the naive way can make the environment hard to reproduce.
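The excerpt does not say which problems the post has in mind, so the following is only a sketch under my own assumptions: it treats two common hazards as "fragile", namely entries without a pinned version and entries that point at a local file path. The function name `find_fragile_lines` and the sample file contents are hypothetical.

```python
def find_fragile_lines(requirements_text):
    """Return (line_number, line, reason) for entries that may not reproduce elsewhere.

    Assumption: "fragile" means unpinned or tied to a local path. This is an
    illustrative rule, not the check described in the original post.
    """
    problems = []
    for number, raw in enumerate(requirements_text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if "@ file://" in line:
            problems.append((number, line, "points at a local file path"))
        elif "==" not in line:
            problems.append((number, line, "version is not pinned"))
    return problems


# Hypothetical file contents for illustration
sample = """\
numpy==1.19.5
pandas
scipy @ file:///tmp/build/scipy
"""

fragile = find_fragile_lines(sample)
for number, line, reason in fragile:
    print(f"line {number}: {line} ({reason})")
```

Running a check like this before committing the file is one cheap way to notice entries that will behave differently on another machine.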

[Read More]

Technology Python requirements.txt

MLOps: the need for a well-organized experiment and development environment

Posted on Fri Feb 26 2021 | 1 min | 240 words |

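MLOps

Reaching at least level 1 is probably worth doing.

What to do

Preprocess the data and save the result
Extract features and save them
Train the classifier and save it
Save the parameters used at each step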
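The four steps above can be sketched as one small function that writes an artifact after each stage. This is a toy illustration using only the standard library: the bag-of-words features, the per-label word-count "classifier", the file names, and the `lowercase` parameter are all assumptions made for the example, not the setup from the original post.

```python
import json
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def save_pickle(obj, path):
    """Persist an intermediate artifact so a later stage can reload it."""
    path.write_bytes(pickle.dumps(obj))


def run_experiment(texts, labels, params, out_dir):
    out_dir = Path(out_dir)

    # 1. Preprocess the data and save the result.
    cleaned = [t.strip().lower() if params["lowercase"] else t.strip() for t in texts]
    save_pickle(cleaned, out_dir / "preprocessed.pkl")

    # 2. Extract features (toy bag-of-words counts) and save them.
    features = [Counter(t.split()) for t in cleaned]
    save_pickle(features, out_dir / "features.pkl")

    # 3. Train a classifier (toy: accumulated word counts per label) and save it.
    model = {}
    for counts, label in zip(features, labels):
        model.setdefault(label, Counter()).update(counts)
    save_pickle(model, out_dir / "model.pkl")

    # 4. Save the parameters that produced these artifacts.
    (out_dir / "params.json").write_text(json.dumps(params, indent=2))
    return model


out_dir = Path(tempfile.mkdtemp())
model = run_experiment(
    texts=["Disk failure ", "disk replaced", "network timeout"],
    labels=["hardware", "hardware", "network"],
    params={"lowercase": True},
    out_dir=out_dir,
)
saved = sorted(p.name for p in out_dir.iterdir())
print(saved)
```

Because every stage leaves a file behind, a later experiment can reload `features.pkl` and swap only the classifier, without rerunning preprocessing.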

Benefits

It becomes easier to separate out the parts you want to experiment with and the features you want to add.

[Read More]

Machine learning Technology Python

An argparse example for passing arguments to a Python script

Posted on Fri Feb 26 2021 | 1 min | 327 words |

Sample program

Quoted from the official documentation:

import argparse

parser = argparse.ArgumentParser()
parser.add_argument("square", type=int,
                    help="display a square of a given number")
parser.add_argument("-v", "--verbose", action="store_true",
                    help="increase output verbosity")
args = parser.parse_args()
answer = args.square**2

Explanation

parser.add_argument("square", type=int,
                    help="display a square of a given number")

The argument is named square.

You can specify its type; the default is str.

A name without a leading - is a positional argument.
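A small sketch that makes these three points observable. The second positional argument, `name`, is a hypothetical addition used only to show the default str type; passing a list to `parse_args()` avoids reading the real command line.

```python
import argparse

parser = argparse.ArgumentParser()
# Positional argument converted to int
parser.add_argument("square", type=int,
                    help="display a square of a given number")
# Hypothetical positional argument with no type: the value stays a str
parser.add_argument("name",
                    help="an untyped positional argument")
# Optional flag: the leading - makes it optional
parser.add_argument("-v", "--verbose", action="store_true",
                    help="increase output verbosity")

# Positional values are matched by order; the flag can appear anywhere
args = parser.parse_args(["4", "7", "--verbose"])
print(type(args.square).__name__, type(args.name).__name__, args.verbose)
```

Here `args.square` is the integer 4, while `args.name` is the string "7" because no type was given.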

[Read More]

Python Technology

Creating data in Natural Language Inference (NLI) format for Sentence Transformers

Posted on Wed Feb 17 2021 | 2 min | 376 words |

I'm trying to use Sentence Transformers to infer causal relationships between documents.

If this works, we can extract the causes and symptoms of an incident from its report.

So I wondered whether NLI could be used for feature learning to extract causal information.

What is NLI?

NLI is the task of inferring the relationship between two sentences. The relationship is one of three:

Forward (entailment)
Inverse (contradiction)
Unrelated (neutral)
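As a rough sketch of what NLI-format data can look like, each example is a sentence pair plus one of three labels. I am assuming here that the three relations correspond to the usual NLI label names (entailment, contradiction, neutral); the integer ids, the column names, and the example sentences are hypothetical and chosen only for illustration.

```python
import csv
import io

# Assumed label names and arbitrary ids for this sketch
LABEL_TO_ID = {"contradiction": 0, "entailment": 1, "neutral": 2}

# Hypothetical sentence pairs: (sentence1, sentence2, label)
pairs = [
    ("The disk failed.", "A storage device stopped working.", "entailment"),
    ("The disk failed.", "The disk is working normally.", "contradiction"),
    ("The disk failed.", "The office moved last year.", "neutral"),
]


def to_tsv(rows):
    """Serialize sentence pairs as tab-separated text with integer labels."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["sentence1", "sentence2", "label"])
    for sentence1, sentence2, label in rows:
        writer.writerow([sentence1, sentence2, LABEL_TO_ID[label]])
    return buffer.getvalue()


tsv = to_tsv(pairs)
print(tsv)
```

Keeping the label mapping in one dictionary makes it easy to relabel the pairs later if the relations are redefined for a causal task.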

Apply to causal relationships

If we apply the three relationships of NLI to causality, the following patterns are possible.

[Read More]

NLI Sentence Transformers technology natural language processing document classification machine learning Python

Python janome: stream mode is the default from 0.4.x (notes on a workaround)

Posted on Wed Feb 17 2021 | 1 min | 193 words |

After upgrading janome, the tokenizer returns its segmented output as a generator.

A generator has the advantage of being memory-efficient, but sometimes you simply want to keep the data in a list.
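One straightforward way to get a list back is to wrap the generator in `list()`. The sketch below is not necessarily the workaround the post describes; `FakeTokenizer` is a hypothetical stand-in whose `tokenize()` yields tokens lazily, so the example runs without janome installed.

```python
def tokens_as_list(tokenizer, text):
    """Consume the generator returned by tokenize() and keep the tokens in a list."""
    return list(tokenizer.tokenize(text))


class FakeTokenizer:
    """Hypothetical stand-in for a tokenizer whose tokenize() is a generator."""

    def tokenize(self, text):
        for word in text.split():
            yield word  # produced one at a time, like a streaming tokenizer


tokens = tokens_as_list(FakeTokenizer(), "すもも も もも")
print(len(tokens), tokens)
```

With janome installed, the same helper should work by passing `Tokenizer()` from `janome.tokenizer` instead of the stand-in. The cost is that all tokens are held in memory at once, which is exactly what the generator was avoiding.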

[Read More]

Technology Python janome Natural language processing Machine learning

What to do when reading an Excel file with pandas fails in Python

Posted on Wed Feb 17 2021 | 1 min | 79 words |

Depending on the xlrd version, reading .xlsx files fails (xlrd 2.0 dropped .xlsx support).

The workaround is to downgrade xlrd.

pip3 install xlrd==1.2.0
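As an alternative to downgrading (this is a different technique from the one above), pandas can be told which engine to use when reading. The sketch below assumes openpyxl is installed and that the pandas version in use supports `engine="openpyxl"`; the helper names `engine_for` and `read_excel_any` are hypothetical.

```python
from pathlib import Path

# Assumed mapping: openpyxl for .xlsx, xlrd for legacy .xls
ENGINE_BY_SUFFIX = {".xlsx": "openpyxl", ".xls": "xlrd"}


def engine_for(path):
    """Pick a pandas Excel engine from the file extension."""
    return ENGINE_BY_SUFFIX[Path(path).suffix.lower()]


def read_excel_any(path):
    """Read an Excel file with an explicitly chosen engine."""
    import pandas as pd  # imported lazily so the helper above works without pandas

    return pd.read_excel(path, engine=engine_for(path))


print(engine_for("report.xlsx"), engine_for("legacy.xls"))
```

Choosing the engine explicitly keeps the behavior independent of whichever xlrd version happens to be installed.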

Reference link

https://qiita.com/fujitatsu0520/items/9e37c2bd2ba2adfd18d4

Related book

Pythonではじめる機械学習

Technology Python xlrd xlsx pandas

Using bagging on distributed representations for classification, and its generalization performance

Posted on Thu Feb 4 2021 | 2 min | 410 words |

Once a distributed representation has been obtained, machine learning can be used to classify it.

Models that can be used include:

Decision Tree
SVM (Support Vector Machine)
NN (Neural Network)

and others.

SVM is included in NN in a broad sense.

In this section, we will use the decision tree method.

Bagging

Conceptually, majority voting over multiple decision trees
Simple theory
- Decision trees are highly explainable and are a classic machine learning model.
- Computational load is light compared to deep learning
  - Depends on the size of the model
Not much explainability
- Do we want to analyze each of the multiple decision trees?
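To make the majority-voting picture concrete, here is a minimal pure-Python sketch of bagging: each model is trained on a bootstrap sample (drawn with replacement) and predictions are combined by vote. It is not the scikit-learn implementation; the one-feature decision stump, the toy data, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def fit_stump(points):
    """Fit a depth-1 decision tree on (x, label) pairs with labels 0 or 1."""
    best = None
    for threshold, _ in points:
        for left_label in (0, 1):
            # Count training errors for the rule "x <= threshold -> left_label"
            errors = sum(
                (left_label if x <= threshold else 1 - left_label) != y
                for x, y in points
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left_label)
    _, threshold, left_label = best
    return lambda x: left_label if x <= threshold else 1 - left_label


def fit_bagging(points, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample of the training data."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_estimators):
        sample = [rng.choice(points) for _ in points]  # draw with replacement
        models.append(fit_stump(sample))
    return models


def predict(models, x):
    """Return the label chosen by the majority of the models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy data: label is 1 when x is greater than 5
data = [(float(x), int(x > 5)) for x in range(11)]
models = fit_bagging(data)
print(predict(models, 0.0), predict(models, 10.0))
```

Each stump sees a slightly different sample, so individual thresholds vary; the vote smooths that variation out, which is where the gain in generalization is expected to come from. It also shows the explainability cost noted above: there are now 25 small models to inspect instead of one.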

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

[Read More]

distributed representation engineering machine learning generalization performance technology Python

How to train a Japanese model with Sentence Transformers to get a distributed representation of a sentence

Posted on Wed Feb 3 2021 | 3 min | 508 words |

BERT is a model that can be powerfully applied to natural language processing tasks.

However, it does not do a good job of capturing sentence-wise features.

Some claim that sentence features appear in the [CLS] token, but this paper (https://arxiv.org/abs/1908.10084) claims that it does not contain that much useful information for the task.

Sentence BERT is a model that extends BERT to be able to obtain features per sentence.
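One way to turn per-token vectors into a single sentence vector is mean pooling over the non-padding tokens, in contrast to taking only the [CLS] vector. The sketch below uses plain Python lists and made-up two-dimensional vectors; the function names and values are hypothetical, and real models work on much larger tensors.

```python
def cls_pooling(token_vectors):
    """Use the first token's vector (the [CLS] position) as the sentence vector."""
    return token_vectors[0]


def mean_pooling(token_vectors, attention_mask):
    """Average the vectors of real tokens, skipping padding (mask value 0)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep == 1]
    dims = len(kept[0])
    return [sum(vec[d] for vec in kept) / len(kept) for d in range(dims)]


# Made-up vectors for: [CLS], word 1, word 2, [PAD]
token_vectors = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0], [0.0, 0.0]]
attention_mask = [1, 1, 1, 0]

cls_vector = cls_pooling(token_vectors)
mean_vector = mean_pooling(token_vectors, attention_mask)
print(cls_vector, mean_vector)
```

The mask matters: without it, padding vectors would drag the average toward zero for short sentences.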

The following are the steps to create Sentence BERT in Japanese.

[Read More]

technology natural language processing BERT distributed representation Sentence Transformers machine learning Python

TensorFlow: how to configure it not to grab all GPU memory at once

Posted on Mon Jan 25 2021 | 1 min | 197 words |

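TensorFlow 1.x reserves all available GPU resources by default.

Because that makes it impossible to see when memory actually runs out, change the setting so that GPU memory is allocated incrementally.

This makes the real GPU usage observable.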
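A hedged sketch of two ways this is commonly configured; the excerpt does not show which one the post uses. The first sets the `TF_FORCE_GPU_ALLOW_GROWTH` environment variable before TensorFlow is imported, in versions that support it. The second uses the TensorFlow 1.x session option `allow_growth`; the function name `make_growing_session` is hypothetical, and TensorFlow is imported inside it so the snippet loads without TensorFlow installed.

```python
import os

# Option 1: ask TensorFlow to grow GPU memory on demand.
# Must be set before TensorFlow is imported.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"


def make_growing_session():
    """Option 2 (TensorFlow 1.x API): create a session that allocates GPU memory incrementally."""
    import tensorflow as tf  # imported lazily; requires TensorFlow 1.x

    config = tf.ConfigProto()
    config.gpu_options.allow_growth = True  # start small, grow as needed
    return tf.Session(config=config)
```

With either option, a tool such as nvidia-smi should then report memory that tracks what the model actually uses rather than the whole card.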

[Read More]

Technology Python tensorflow gpu Machine learning

DeepLabCut: predicting point positions in video

Posted on Sun Jan 24 2021 | 1 min | 145 words |

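Predicting point annotations

Wide range of applications: a fly's abdomen, a mouse's spine, finger joints, and so on.
It works on video. There is a demo.
Feature extraction from video uses ResNet, MobileNet, and similar networks
- If MobileNet is sufficient, edge computing comes into view
- For example, a Raspberry Pi plus GPU setup

Reference link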

https://github.com/DeepLabCut/DeepLabCut

[Read More]

Python Image processing Technology Machine learning
