Python on サブカル科学研究会のブログ

Notes on working around an error where PyTorch cannot be imported with CTranslate2

Fri, 18 Aug 2023 16:23:27 +0900

Official installation method

pip install ctranslate2

https://github.com/OpenNMT/CTranslate2

Does not work well on Mac

It ended in a segmentation fault.

However, I heard that it works on Linux, so I first tried building a Docker environment on the Mac to check whether that avoids the problem.

Installing pysen causes mypy-related errors in other libraries

Sun, 18 Jun 2023 01:53:52 +0900

If pysen is installed, do its dependencies conflict with LangChain?

Environment

macOS, poetry, python == 3.9

Installing LangChain fails

poetry add langchain

fails

CLANG 〜

It looked like an error related to C.

What was it again?

Install and update the Xcode tools

pip install --upgrade pip
pip install --upgrade setuptools

It seems to be an error caused by the mypy version

Workaround for PyTorch failing even after installing it in a poetry environment

Wed, 07 Jun 2023 05:47:12 +0900

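Paths are not set up after `poetry add torch`

I was building the environment with a Dockerfile plus poetry.
I installed PyTorch with poetry add torch.
import torch raised an error.
Apparently the CUDA-related paths were not set.

Workaround

poetry run python -m pip install torch
Apparently this can also be specified in poetry.toml.

References

Stack Overflow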

How to run streamlit in a poetry environment

Thu, 24 Feb 2022 16:01:29 +0900

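Symptoms

When streamlit is installed with poetry, streamlit cannot be run.
After adding streamlit with poetry add streamlit, the streamlit executable is not on the PATH of a normal shell.
which streamlit prints nothing.

Fix

Start a shell from poetry:
poetry shell
streamlit run sample.py
The streamlit command can now be run.
When streamlit is installed in a virtual environment, it cannot be run from a normal shell.
The fix for that case is described on the official site.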
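
The PATH check done with which can also be done from Python. The following is a minimal sketch that uses only the standard library; the helper name `locate_command` is hypothetical and is not part of poetry or streamlit. It returns the resolved path when the command is on PATH and None otherwise, which is what a shell outside the poetry environment would be expected to report for streamlit.

```python
import shutil
from typing import Optional


def locate_command(name: str) -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = locate_command("streamlit")
    if path is None:
        # Typical result in a normal shell when streamlit lives in the poetry venv
        print("streamlit is not on PATH; try again inside `poetry shell`")
    else:
        print(f"streamlit found at {path}")
```

Running the command as poetry run streamlit run sample.py is another way to use the environment's executable without opening a new shell.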

Reference links

Cannot create an environment with poetry from PyCharm

Thu, 24 Feb 2022 11:56:49 +0900

Symptoms

PyCharm reported an error when I specified the interpreter. When I then tried to rebuild the poetry environment, the following error appeared.

ModuleNotFoundError No module named 'virtualenv.activation.xonsh' at <frozen importlib._bootstrap>:984 in _find_and_load_unlocked

Solution

pip3 uninstall virtualenv
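
To see whether a given interpreter can resolve the module named in the error, a small diagnostic like the one below may help. This is only a sketch using the standard library; `module_available` is a hypothetical helper name, and the result depends on which interpreter runs it, so it should be run with the same Python that poetry uses.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Report whether `name` can be located without fully importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package of a dotted name is itself missing
        return False


if __name__ == "__main__":
    for name in ("virtualenv", "virtualenv.activation.xonsh"):
        print(name, module_available(name))
```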

Cause

Was updating anyenv the culprit?

Lessons learned

Careless updates cause breakage.

Building a Python environment with poetry on macOS

Thu, 20 May 2021 20:53:52 +0900

pip is the most common way to install Python libraries.

poetry is a more advanced tool for managing the dependencies and versions of a development environment.

It seems to officially support integration with pyenv.

I’ll write down how to install it on macOS and where I got stuck.

Advantages of poetry

Can organize library dependencies.
- Some library versions have unexpected side effects.
- Trying to recreate an environment can fail because of library versions and installation order.
- Environment setup is unavoidable manual work, so it should be automated where possible.
- It also seems able to update library versions while taking dependencies into account.
- And it keeps a record of the resulting state.
Is it possible to separate the dependency records by git branch?
The libraries you can install are comparable to pip
- Does it pull from the same package index, PyPI?
Usability is not much different from pip
- poetry add instead of pip install
It recognizes virtual environments created with pyenv and works with them.

I’m going to install it because it seems to be a convenient way to build an environment without much effort.

Portfolio

Thu, 08 Apr 2021 13:07:22 +0900

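Track record

Machine learning and related work
- Built a model that estimates ornamental fish breeds from images
  - Identifying ornamental fish breeds with AI
- Built a demo model that estimates dog and cat breeds from images
  - Built an image-processing model using attention
- Advice on running an image generation model published by NVIDIA
  - coconala
- Advice on statistical processing with R
  - coconala
- Built a demo of an image generation model using GANs
  - coconala
- Wrote a book that implements and explains distributed word representation models in natural language processing
  - A first introduction to natural language processing with Google Colaboratory and Python
- Built a demo search system using word2vec
  - Building something like a search system for teratail with word2vec
- Implemented a model that generates headlines for online news and similar text
  - I built an AI that writes short news items
  - I built a deep learning model that automatically generates news titles
Apps and related work
- Built a LINE bot that notifies about Google Drive updates
  - I built a junior-colleague-style LINE bot with Google Apps Script
- Built a to-do app that calculates scores
  - I built a to-do app with gamification elements in AppSheet
  - Building a to-do list with Vue.js
Website demos
- Sample site for an after-school day service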

How to create a clean requirements.txt for setting up a Python environment

Wed, 17 Mar 2021 12:00:16 +0900

requirements.txt is sometimes used when setting up a Python environment.

However, a requirements.txt created naively can make the environment hard to reproduce.

An argparse example for passing arguments to a Python script

Fri, 26 Feb 2021 15:17:54 +0900

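Sample program

The following is quoted from the official site.

import argparse

# Create the parser before registering any arguments
parser = argparse.ArgumentParser()
# Positional argument: required, converted to int
parser.add_argument("square", type=int,
                    help="display a square of a given number")
# Optional flag: True when -v/--verbose is given, otherwise False
parser.add_argument("-v", "--verbose", action="store_true",
                    help="increase output verbosity")
args = parser.parse_args()
answer = args.square**2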

Explanation

parser.add_argument("square", type=int,
                    help="display a square of a given number")

The argument is named square.

A type can be specified. The default is str.

A name without a leading - is a positional argument.
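
The points above can be tried without a command line by passing an explicit argument list to parse_args. The following is a minimal sketch; the function names `build_parser` and `describe` are hypothetical and only serve the illustration. It shows the positional argument being converted to int and the optional flag defaulting to False.

```python
import argparse
from typing import Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Square a number.")
    # No leading "-": positional and required; type=int converts the string
    parser.add_argument("square", type=int,
                        help="display a square of a given number")
    # Leading "-": optional; store_true makes it a boolean flag
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="increase output verbosity")
    return parser


def describe(argv: Sequence[str]) -> str:
    args = build_parser().parse_args(argv)
    answer = args.square ** 2
    if args.verbose:
        return f"the square of {args.square} equals {answer}"
    return str(answer)


print(describe(["4"]))        # 16
print(describe(["4", "-v"]))  # the square of 4 equals 16
```

Passing a list to parse_args keeps the example independent of sys.argv, which is convenient in notebooks and tests.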

Python janome: stream mode is the default from the 0.4 series; notes on a workaround

Wed, 17 Feb 2021 10:31:12 +0900

After upgrading janome, the tokenized output is returned as a generator.

Generators have the advantage of being memory-efficient, but sometimes you want to keep the data in a list.
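
One common way to get a list back is to wrap the generator in list(). The sketch below does not import janome; `fake_tokenize` is a hypothetical stand-in for a tokenizer that yields tokens lazily, and with janome the same wrapping would presumably be applied to the result of Tokenizer().tokenize(text). It also shows the trade-off: a generator can be consumed only once.

```python
from typing import Iterable, Iterator, List


def fake_tokenize(text: str) -> Iterator[str]:
    """Hypothetical stand-in for a tokenizer that yields tokens lazily."""
    for token in text.split():
        yield token


def to_list(tokens: Iterable[str]) -> List[str]:
    """Consume a lazy token stream once and keep the result in memory."""
    return list(tokens)


stream = fake_tokenize("sumomo mo momo mo momo no uchi")
tokens = to_list(stream)

print(tokens)                  # the full token list
print(len(tokens), tokens[0])  # len() and indexing work on a list
print(list(stream))            # [] because the generator is already exhausted
```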

What to do when reading an Excel file with pandas fails in Python

Wed, 17 Feb 2021 10:29:42 +0900

Depending on the xlrd version, reading .xlsx files fails (xlrd 2.0 and later read only .xls files).

The workaround is to downgrade.

pip3 install xlrd==1.2.0

Reference link

https://qiita.com/fujitatsu0520/items/9e37c2bd2ba2adfd18d4

tensorflow: how to keep it from reserving all GPU memory at once

Mon, 25 Jan 2021 08:17:54 +0900

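TensorFlow 1.x reserves all available GPU resources.

Since that makes memory overruns impossible to observe, change the setting so that GPU memory is allocated incrementally.

This makes it possible to observe GPU usage.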

DeepLabCut: predicting point positions in video

Sun, 24 Jan 2021 12:56:01 +0900

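Predicting point annotations

Wide range of applications: the abdomen of a fly, the spine of a mouse, finger joints, and so on.
Works on video. There is a demo.
Feature extraction from video uses ResNet, MobileNet, and others
- If it works with MobileNet, edge computing comes into view
- A setup such as Raspberry Pi + GPU

Reference link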

https://github.com/DeepLabCut/DeepLabCut

pycaret notes

Sun, 24 Jan 2021 12:52:46 +0900

Similar to scikit-learn?
Can use GPUs

Reference link

https://pycaret.org

Using BART (sentence summary model) with hugging face

Tue, 19 Jan 2021 03:03:11 +0900

BART is a model for document summarization
Derived from the same transformer as BERT
Unlike BERT, it has an encoder-decoder structure
- This is because it is intended for sentence generation

This page shows the steps to run a tutorial on BART.

Procedure

Install transformers

pip install transformers

Run the summary

```python
from transformers import BartTokenizer, BartForConditionalGeneration, BartConfig

model = BartForConditionalGeneration.from_pretrained('facebook/bart-large')
tokenizer = BartTokenizer.from_pretrained('facebook/bart-large')

ARTICLE_TO_SUMMARIZE = "My friends are cool but they eat too many carbs."
inputs = tokenizer([ARTICLE_TO_SUMMARIZE], max_length=1024, return_tensors='pt')

# Generate Summary
summary_ids = model.generate(inputs['input_ids'], num_beams=4, max_length=5, early_stopping=True)
print([tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=False) for g in summary_ids])
```

On 2021/01/18, the output was MyMy friends.

Interesting.

Where I got stuck
An error occurs when the installed PyTorch version differs from the one transformers expects.

pip install -U torch

How to use mlflow in Python

Sat, 18 Jul 2020 16:24:00 +0900

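Notes on using mlflow in Python

mlflow looked handy for comparing experiment results, so I tried it.

It can automate the recording of parameters and experiment results to some extent.

Machine learning practice often turns into a kind of black magic, so the effort put into ensuring reproducibility pays off later.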

Procedure for obtaining a distributed representation of a Japanese sentence using a trained Universal Sentence Encoder

Mon, 22 Jun 2020 18:29:00 +0900

Sentence vectors can be obtained with the Universal Sentence Encoder.

Features

Supports multiple languages.

Japanese is supported.

Can handle Japanese sentences as vectors.

Use cases

Clustering, similarity calculation, feature extraction.

How to use

Run the following command as preparation.

pip install tensorflow tensorflow_hub tensorflow_text numpy

Pretrained models are available.

The Python code below shows how to use one.

```py
import tensorflow_hub as hub
import tensorflow_text  # not referenced directly, but registers ops the model needs
import numpy as np
# Work around SSL certificate errors by skipping verification (for experiments only)
import ssl
ssl._create_default_https_context = ssl._create_unverified_context

def cos_sim(v1, v2):
    return np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))

embed = hub.load("https://tfhub.dev/google/universal-sentence-encoder-multilingual/3")

texts = ["I saw a comedy show yesterday." , "There was a comedy show on TV last night." , "I went to the park yesterday." , "I saw a comedy show last night.", "Yesterday, I went to the park."]
vectors = embed(texts)
```

See the following link for more details

[Try Universal Sentence Encoder in Japanese](https://qiita.com/kenta1984/items/9613da23766a2578a27a)
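
Once sentence vectors are available, the similarity calculation mentioned under the use cases comes down to comparing them pairwise. The sketch below is dependency-free and uses made-up 3-dimensional vectors in place of real embeddings; `cosine_similarity` and `rank_by_similarity` are hypothetical names, with the former a plain-Python stand-in for the cos_sim function in the sample.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(v1: Sequence[float], v2: Sequence[float]) -> float:
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(b * b for b in v2))
    return dot / (norm1 * norm2)


def rank_by_similarity(
    query: Sequence[float], vectors: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (index, similarity) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, v)) for i, v in enumerate(vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy vectors standing in for real sentence embeddings
toy_vectors = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
]
ranking = rank_by_similarity(toy_vectors[0], toy_vectors)
print([index for index, _ in ranking])  # [0, 1, 2]
```

With real embeddings, each row of the model output would take the place of a toy vector.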


Postscript
```py
import tensorflow_text
```

Without this line, you will get an error like `Sentencepiece not found!`. The line is not used explicitly in the sample source, but it is required at runtime.

Setting up an environment for natural language processing with Python

Thu, 18 Jun 2020 07:34:00 +0900

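This article is written for readers who are reasonably comfortable with the command line.

If you do not know what the command line is but want to use Python, consider using a service called Google Colaboratory.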

Understand the word2vec algorithm by running it in a notebook and watching how it behaves

Wed, 17 Jun 2020 07:36:00 +0900

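Let's understand word2vec!

Are you struggling to study the word2vec algorithm?
- The idea behind the algorithm is surprisingly intuitive, but reading that intuition out of the algorithm's description may take some knack.
- It helps to play with a model that actually runs and get a feel for it by watching how it responds.
- Wouldn't it be easier to follow if you could run a program line by line yourself and check the output as you go?

No environment setup required!

So let's use a service called Google Colaboratory to run word2vec easily and understand how the algorithm works!
- Google Colaboratory is a service provided by Google.
- If you have a Gmail account, it saves you the trouble of setting up an environment and lets you use Google's computing resources.
For that purpose, I prepared a program that runs word2vec.
I distributed this program at an event called 技術書典 (Techbookfest), and more than 50 people have used it.

Purchase from the link below

See the link below for details.
- Understand the word2vec algorithm by running a program and watching how it behaves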

A note on how to use the newly available BERT model pretrained on Japanese Wikipedia

Wed, 17 Jun 2020 07:34:00 +0900

huggingface has released a Japanese model for BERT.

The Japanese model is included in transformers.

However, I stumbled over a few things before I could get it to actually work in a Mac environment, so I’ll leave a note.

Preliminaries: Installing mecab

The morphological analysis engine, mecab, is required to use BERT’s Japanese model.

The tokenizer will probably ask for mecab.

This time, we will use Homebrew to install mecab and ipadic.

How to set up a Python environment on Mac using pyenv

Tue, 16 Jun 2020 04:58:00 +0900

Setting up a Python environment on Mac

Are you wondering how to install Python on a Mac?

You can simply install it with Homebrew, but that has the following drawbacks.

How to use NeuralClassifier, a library that provides a crazy number of models for document classification problems

Mon, 15 Jun 2020 02:11:00 +0900


NeuralClassifier: An Open-source Neural Hierarchical Multi-label Text Classification Toolkit is a python library for multi-label document classification problems published by Tencent.

For more information, see [NeuralClassifier: An Open-source Neural Hierarchical Multi-label Text Classification Toolkit](https://github.com/Tencent/NeuralNLP-NeuralClassifier).

NeuralClassifier is designed for quick implementation of neural models for hierarchical multi-label classification task, which is more challenging and common in real-world scenarios.
