Depending on the xlrd version, reading .xlsx files fails: xlrd 2.0.0 and later support only the legacy .xls format.
The workaround is to downgrade:
pip3 install xlrd==1.2.0
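
A minimal sketch of checking the installed xlrd version before trying to read a .xlsx file. The helper names `xlrd_supports_xlsx` and `installed_xlrd_version` are hypothetical, and the parsing assumes a plain `major.minor.patch` version string:

```python
from importlib import metadata


def xlrd_supports_xlsx(version: str) -> bool:
    """Return True if this xlrd version can read .xlsx (major version below 2)."""
    major = int(version.split(".")[0])
    return major < 2


def installed_xlrd_version():
    """Return the installed xlrd version string, or None if xlrd is missing."""
    try:
        return metadata.version("xlrd")
    except metadata.PackageNotFoundError:
        return None


version = installed_xlrd_version()
if version is None:
    print("xlrd is not installed")
elif xlrd_supports_xlsx(version):
    print(f"xlrd {version} can read .xlsx")
else:
    print(f"xlrd {version} reads only .xls; consider: pip3 install xlrd==1.2.0")
```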
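TensorFlow 1.x reserves all available GPU memory by default.
This makes it impossible to see when memory is actually running out, so change the configuration to allocate GPU memory incrementally.
With this setting, actual GPU usage can be observed.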
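
Incremental allocation corresponds to the `allow_growth` option in the session's GPU options. A minimal sketch, assuming TensorFlow 1.x is installed; the helper name `make_growth_config` is hypothetical:

```python
def make_growth_config():
    """Build a session config that grows GPU memory on demand (TensorFlow 1.x API)."""
    import tensorflow as tf  # imported lazily; assumes TensorFlow 1.x is installed

    config = tf.ConfigProto()
    # Allocate GPU memory as needed instead of reserving it all up front.
    config.gpu_options.allow_growth = True
    return config
```

Pass the result when creating the session, for example `tf.Session(config=make_growth_config())`. On TensorFlow 2.x the same classes are available under `tf.compat.v1`.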
[Read More]

Predicting point annotations.
https://github.com/DeepLabCut/DeepLabCut
[Read More]

This page shows the steps to run a tutorial on BART.
1. Install transformers:

```sh
pip install transformers
```

2. Run the summary:

```py
from transformers import BartTokenizer, BartForConditionalGeneration

# Load the pretrained model and its tokenizer
model = BartForConditionalGeneration.from_pretrained('facebook/bart-large')
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')

ARTICLE_TO_SUMMARIZE = "My friends are cool but they eat too many carbs."
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, truncation=True, return_tensors='pt')

# Generate Summary
summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=5, early_stopping=True)
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
```
On 2021/01/18, the output was `MyMy friends` (`max_length=5` caps the generated summary at five tokens).
Interesting.
## Where I got stuck

An error occurs when the installed PyTorch version differs from the one transformers expects. Upgrading PyTorch fixed it:
pip install -U torch
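[Read More]

I tried mlflow because it looked handy for comparing experiment results.
It can automate the recording of parameters and experiment results to some extent.
Practical machine learning often turns into a kind of black magic, so effort spent on ensuring reproducibility pays off later.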
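
A minimal sketch of recording one run with mlflow's tracking API, assuming mlflow has been installed with pip. The helper name `log_experiment` and the default run name are hypothetical:

```python
def log_experiment(params: dict, metrics: dict, run_name: str = "example-run"):
    """Record one experiment's parameters and results as a single mlflow run."""
    import mlflow  # imported lazily; assumes `pip install mlflow`

    with mlflow.start_run(run_name=run_name):
        mlflow.log_params(params)    # e.g. hyperparameters used for training
        mlflow.log_metrics(metrics)  # e.g. numeric evaluation scores
```

Call it once per experiment with the hyperparameters and the scores that experiment produced. Running `mlflow ui` afterwards opens a local page where the recorded runs can be compared.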
[Read More]

Universal Sentence Encoder can turn documents into vectors.
It supports multiple languages, including Japanese, so Japanese sentences can be handled as vectors.
The vectors can be used for clustering, similarity calculation, and feature extraction.
As preparation, run the following command:
pip install tensorflow tensorflow_hub tensorflow_text numpy
Pretrained models are available.
The Python code below shows how to use one.

```py
import ssl
import numpy as np
import tensorflow_hub as hub
import tensorflow_text  # not referenced directly, but required by the model

# Work around SSL certificate errors (this disables certificate verification)
ssl._create_default_https_context = ssl._create_unverified_context

def cos_sim(v1, v2):
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")
texts = [
    "I saw a comedy show yesterday.",
    "There was a comedy show on TV last night.",
    "I went to the park yesterday.",
    "I saw a comedy show last night.",
    "Yesterday, I went to the park.",
]
vectors = embed(texts)
```

See the following link for more details:
[Try Universal Sentence Encoder in Japanese](https://qiita.com/kenta1984/items/9613da23766a2578a27a)
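
The sample defines `cos_sim` but never calls it. A minimal sketch of using it to find the sentence closest to a query, with toy 3-dimensional vectors standing in for the encoder output; the helper name `most_similar` and the toy values are hypothetical:

```python
import numpy as np


def cos_sim(v1, v2):
    """Cosine similarity between two 1-D vectors."""
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))


def most_similar(vectors, query_index=0):
    """Return (index, similarity) of the vector closest to vectors[query_index]."""
    vectors = np.asarray(vectors)
    scores = [
        (i, float(cos_sim(vectors[query_index], v)))
        for i, v in enumerate(vectors)
        if i != query_index
    ]
    return max(scores, key=lambda pair: pair[1])


# Toy stand-ins for embeddings (not real encoder output).
toy_vectors = np.array([
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
])
index, score = most_similar(toy_vectors, query_index=0)
print(index, score)
```

With the real encoder output, `most_similar(vectors.numpy())` should work the same way, since the returned tensor can be converted to a NumPy array.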
### Postscript
```py
import tensorflow_text
```

Without this line, you will get an error such as `Sentencepiece not found!`. The module is never referenced explicitly in the sample code, but it must be imported for the code to run.
[Read More]

This article is written for readers who are reasonably comfortable with the command line.
If you don't know what the command line is but want to use Python, consider using the Google Colaboratory service.
[Read More]

Hugging Face has released a Japanese model for BERT.
The Japanese model is included in transformers.
However, I stumbled over a few things before getting it to work on a Mac, so I'm leaving a note here.
The Japanese BERT model requires MeCab, a morphological analysis engine, because the tokenizer depends on it.
Here we use Homebrew to install MeCab and ipadic.
[Read More]