convmv @ 藍色情懷 :: 痞客邦 ::

convmv - converts filenames from one encoding to another

Its usage is similar to that of iconv(1) (*).

-f ENCODING
specify the current encoding of the filename(s), i.e. the encoding to convert from
-t ENCODING
specify the encoding to which the filename(s) should be converted
-r
recursively go through directories
--list
list all available encodings. To get support for more Chinese or Japanese encodings, install the Perl HanExtra or JIS2K Encode packages.
--nosmart
by default convmv detects whether a filename is already UTF-8 encoded and skips that file when a conversion from some charset to UTF-8 is requested. --nosmart forces conversion to UTF-8 for such files as well, which might result in "double encoded UTF-8" (discussed further below). Use it only if you do not want convmv to do this detection.
--help
print a short summary of available options

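Consider the following scenarios:
1. You used to run Red Hat 9 or an even older system whose default encoding was GB2312, and many of your filenames are in Chinese. Then you followed Fedora and upgraded to the latest Fedora Core, whose default encoding is UTF-8. You find that the names of all those old files have turned into garbage, or are reported as "invalid UTF-8 sequence".
2. You did a fresh install of the latest Fedora Core 3 and then, following the various pieces of advice found online, changed the default encoding from UTF-8 to GB2312, GBK or even GB18030 while installing fcitx. The change itself really is very simple, although you also have to add one more option when mounting, mount -o iocharset=cp936 (the command line really gets far too long). Then one day you suddenly have to log in under a different language. Just as above, every Chinese filename becomes an "invalid UTF-8 sequence".
3. You use UTF-8 as the default encoding. Then you set up an FTP server so that people can upload things, or you are running an ancient Samba 2.x. You find that every uploaded file with a Chinese name shows up as question marks. Likewise, unless you download with the gftp build patched by that respected fellow on the board, anything you download with a Chinese filename is unrecognizable: "invalid UTF-8 sequence".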
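All three scenarios come down to the same thing: the bytes of a filename were written under one encoding and are later interpreted under another. A minimal Python sketch of why a GBK-encoded name is rejected as UTF-8 (the filename and the helper `is_valid_utf8` are hypothetical, chosen only for illustration):

```python
# Hypothetical filename, used only for illustration.
name = "中文文件.txt"

# The bytes a GBK-locale system would have stored on disk for this name.
gbk_bytes = name.encode("gbk")


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The legacy bytes fail the check; the same name stored as UTF-8 passes.
print(is_valid_utf8(gbk_bytes))
print(is_valid_utf8(name.encode("utf-8")))
```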

convmv is meant to help convert a single filename, a directory tree and the files it contains, or a whole filesystem into a different encoding. It converts only the filenames, not the content of the files. A special feature of convmv is that it also takes care of symlinks: if the target of a symlink is being converted, the symlink's target pointer is converted as well. If only some of the names in a directory are UTF-8 while the rest use a legacy encoding, convmv can handle that too: it detects the encoding automatically and converts only the names that need it.
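The behaviour described above can be approximated in a few lines of Python. This is only a rough sketch of the idea, not convmv's actual implementation: the names `convert_name` and `convert_tree` are made up for this example, symlink targets are not handled, and the `dry_run` default is my own choice. The "already UTF-8?" check is also a heuristic, since a legacy byte sequence can occasionally happen to be valid UTF-8.

```python
import os


def convert_name(raw: bytes, src: str, dst: str, smart: bool = True) -> bytes:
    """Re-encode one filename (as bytes) from encoding src to encoding dst.

    With smart=True and a UTF-8 target, a name that already decodes as
    UTF-8 is returned unchanged (a rough analogue of the default check).
    """
    if smart and dst.lower().replace("-", "") == "utf8":
        try:
            raw.decode("utf-8")
            return raw  # already valid UTF-8, leave it alone
        except UnicodeDecodeError:
            pass
    return raw.decode(src).encode(dst)


def convert_tree(root: bytes, src: str, dst: str, dry_run: bool = True):
    """Walk root bottom-up and rename entries whose names need converting.

    Returns the list of (old_path, new_path) pairs. Nothing is renamed
    unless dry_run is False. Only names change, never file contents.
    """
    planned = []
    # Passing bytes to os.walk yields bytes names, so undecodable names survive.
    # Bottom-up order renames a directory only after its contents are done.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for entry in filenames + dirnames:
            new = convert_name(entry, src, dst)
            if new != entry:
                old_path = os.path.join(dirpath, entry)
                new_path = os.path.join(dirpath, new)
                planned.append((old_path, new_path))
                if not dry_run:
                    os.rename(old_path, new_path)
    return planned
```

Calling `convert_tree(b"/some/dir", "gbk", "utf-8")` (a hypothetical path) would only report what it intends to rename; passing `dry_run=False` performs the renames.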

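If some filenames were already UTF-8 encoded but you disabled convmv's automatic detection, so that those names were converted a second time, you can use convmv to restore them: simply swap the -f and -t arguments.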
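To see why swapping the two encodings undoes the damage, here is a small Python sketch of "double encoded UTF-8". It uses latin-1 as the wrongly assumed source encoding purely because every byte decodes under it, which keeps the example deterministic; that choice and the helper name `force_convert` are assumptions for illustration.

```python
def force_convert(raw: bytes, src: str, dst: str) -> bytes:
    """Convert unconditionally, with no 'is it already UTF-8?' check."""
    return raw.decode(src).encode(dst)


# A name that is already correct UTF-8 (hypothetical example name).
original = "中文.txt".encode("utf-8")

# Forced conversion treats the UTF-8 bytes as latin-1 text and encodes
# them again: the result is double-encoded UTF-8.
double = force_convert(original, "latin-1", "utf-8")

# Swapping the "from" and "to" encodings reverses the mistake.
restored = force_convert(double, "utf-8", "latin-1")

print(double != original, restored == original)
```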

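The --qfrom option is very useful: it tells convmv not to print the original filenames while processing. Those names are garbage in your current terminal, and printing them would make a mess of it.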

See also: locale(1), utf-8(7), charsets(7)
Author: Bjoern JACKE (send mail to bjoern [at] j3e.de for bug reports and suggestions)

(*)
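iconv(1) converts the content of a file from one encoding to another. For example, suppose a document input.txt was edited on Windows and is GB2312 encoded. In an FC3 environment whose default encoding is UTF-8, gedit will recognize the encoding automatically when it opens the file, but vi will show garbage; in that case you should convert the file with iconv before opening it. -f means "from" and is the encoding of the original file; -t means "to" and is the target encoding; -o is the output filename. Take great care never to write the output to the original file.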
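The same content conversion can be sketched in Python. This is not how iconv itself is implemented; `convert_file` and `roundtrip_demo` are hypothetical names, and the refusal to overwrite the input is my own guard, added because opening the output for writing truncates it before anything has been read.

```python
import os
import tempfile


def convert_file(src_path: str, dst_path: str, from_enc: str, to_enc: str) -> None:
    """Re-encode the content of src_path into dst_path (analogous to -f, -t, -o)."""
    if os.path.abspath(src_path) == os.path.abspath(dst_path):
        # Opening the output in "w" mode would empty the input first.
        raise ValueError("refusing to write the output over the input file")
    # newline="" keeps the original line endings (e.g. CRLF from Windows).
    with open(src_path, "r", encoding=from_enc, newline="") as fin, \
         open(dst_path, "w", encoding=to_enc, newline="") as fout:
        for line in fin:
            fout.write(line)


def roundtrip_demo() -> str:
    """Write a GB2312 file, convert it to UTF-8, and return the decoded result."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "input.txt")
        dst = os.path.join(tmp, "output.txt")
        with open(src, "wb") as f:
            f.write("你好\r\n".encode("gb2312"))
        convert_file(src, dst, "gb2312", "utf-8")
        with open(dst, "rb") as f:
            return f.read().decode("utf-8")
```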

Bluelove1968
