Welcome to the Linux Foundation Forum!
iconv and sed help
usmangt
Posts: 42
in Command Line
Hi,
I have a UTF-8 file that I need to convert to ISO-8859-1.
The UTF-8 file contains characters like å/ä/ö, and I don't want these characters.
So I apply the sed command:
$ sed "s/å/aa/g; s/ä/aaa/g; s/ö/ooo/g" utf8.txt > output.txt
Now when I view this file, it no longer contains characters like å/ä/ö.
Then I use the iconv command to convert that UTF-8 file (output.txt) to ISO-8859-1:
$ iconv -c -f UTF-8 -t ISO-8859-1 < output.txt > newfile
BUT when I check the file type with the file command, it reports ASCII rather than ISO-8859-1:
$ file newfile
newfile: ASCII text, with CRLF line terminators
I don't understand what went wrong. I have also attached that UTF-8 file to this post.
Please help.
usmangt
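One possible explanation for the file result, sketched under assumptions: ISO-8859-1 is a superset of ASCII, so once sed has replaced every non-ASCII character, the converted file contains only bytes below 0x80, and file cannot tell it apart from plain ASCII. The sample text and the helper name count_non_ascii below are made up for illustration and are not taken from the attached file.

```shell
#!/bin/sh
# Hypothetical helper: count bytes >= 0x80 (outside 7-bit ASCII) on stdin.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' | wc -c | tr -d ' '
}

tmpdir=$(mktemp -d)

# Made-up sample: "å ä ö" in UTF-8, written as octal byte escapes.
printf '\303\245 \303\244 \303\266\n' > "$tmpdir/utf8.txt"

# Non-ASCII bytes in the UTF-8 input (two bytes per character here).
before=$(count_non_ascii < "$tmpdir/utf8.txt")

# The two steps from the post: replace the characters, then convert.
sed 's/å/aa/g; s/ä/aaa/g; s/ö/ooo/g' "$tmpdir/utf8.txt" > "$tmpdir/output.txt"
iconv -c -f UTF-8 -t ISO-8859-1 < "$tmpdir/output.txt" > "$tmpdir/newfile"
after=$(count_non_ascii < "$tmpdir/newfile")

# For contrast: convert without the sed step (one byte per character).
direct=$(iconv -f UTF-8 -t ISO-8859-1 < "$tmpdir/utf8.txt" | count_non_ascii)

echo "UTF-8 input:          $before non-ASCII bytes"
echo "after sed and iconv:  $after non-ASCII bytes"
echo "iconv without sed:    $direct non-ASCII bytes"

rm -rf "$tmpdir"
```

If the count after sed and iconv is zero, the output is valid ISO-8859-1 and valid ASCII at the same time, and a report of "ASCII text" would not by itself indicate a failed conversion.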
Comments
-
I went through your exact procedure on Slackware 13.1, and my output file shows as:
ut3.txt: ISO-8859 text, with very long lines
How the data is read and displayed may be controlled by a deeper configuration within your OS. Can you share which distro you use, so those familiar with it can tell you where those settings are?
-
I am using the Fedora 13 Linux distribution.
-
Hi,
I am so sorry, I attached the wrong file (both have the same name but are in different folders on my machine).
This is the one that is causing the problem.
-
Here is the file.
I don't know why the name became so long when uploading.
[file name=utf8-7a6351909c73ba4a81575d6ad10cf46f.txt size=1131]http://www.linux.com/media/kunena/attachments/legacy/files/utf8-7a6351909c73ba4a81575d6ad10cf46f.txt[/file]
-
Now that I have processed your original file, I am getting the same issue; it appears that something is different between the files.
The two files are very different. I have concatenated your commands into:
$ sed "s/å/aa/g; s/ä/aaa/g; s/ö/ooo/g" utf8.txt | iconv -c -f UTF-8 -t ISO-8859-1 -o out.txt
When I ran that command against both files, I got the following output:
matt:~/Desktop$ rm *.txt.txt;for i in `ls|grep utf|grep -v "txt\.txt"`;do sed "s/å/aa/g; s/ä/aaa/g; s/ö/ooo/g" $i|iconv -c -f UTF-8 -t ISO-8859-1 -o $i.txt ;file $i;file $i.txt;done
utf8.txt: UTF-8 Unicode text, with very long lines, with CRLF line terminators
utf8.txt.txt: ISO-8859 text, with very long lines, with CRLF line terminators
utf82.txt: UTF-8 Unicode text
utf82.txt.txt: ASCII text
Based on the output, it looks as though the line terminators in the second file are not ISO-8859-1 compliant, but the iconv application does not correct those.
-
Thank you for analyzing and checking it. Yes, I suspect the same thing, and I am also concerned about the ' - ' (minus symbol/character) in the file.
Do you think there is a solution for this?
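One way to check the concern about the minus character might be to run iconv without -c. With -c, characters that have no ISO-8859-1 equivalent are silently dropped; without it, iconv should stop and return a non-zero exit status. The sketch below assumes, purely for illustration, that the character is U+2212 MINUS SIGN rather than the ASCII hyphen-minus; the actual content of the attachment is not known here.

```shell
#!/bin/sh
tmpdir=$(mktemp -d)

# Hypothetical sample: "a", U+2212 MINUS SIGN (UTF-8 bytes E2 88 92), "b".
printf 'a\342\210\222b\n' > "$tmpdir/sample.txt"

# Strict conversion: no -c, so an unconvertible character is an error.
if iconv -f UTF-8 -t ISO-8859-1 < "$tmpdir/sample.txt" \
        > "$tmpdir/strict.out" 2>/dev/null; then
    strict_ok=yes
else
    strict_ok=no
fi

# Lenient conversion: -c discards what cannot be represented.
# Some iconv builds still exit non-zero here, hence the "|| true".
lenient=$(iconv -c -f UTF-8 -t ISO-8859-1 < "$tmpdir/sample.txt" 2>/dev/null || true)

echo "strict conversion succeeded: $strict_ok"
echo "lenient output: $lenient"

rm -rf "$tmpdir"
```

If the strict run fails on the real file, the offending character could be handled with another sed substitution before the conversion. Some iconv implementations also accept a //TRANSLIT suffix on the target encoding to substitute approximations, though the result varies by implementation.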
Thank you
usmangt
-
Can you tell me if the two files were created on different platforms, such as file1 being created in Windows and file2 being created in Linux?
-
Well, both were created on Linux.
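Where a file was created does not by itself determine its line terminators, so it may be simpler to check for carriage returns directly. CR and LF are both ASCII control characters and are therefore valid in ISO-8859-1 as well, which is presumably why file reports the terminators separately from the encoding. The sketch below uses a made-up two-line sample and a hypothetical helper named count_cr.

```shell
#!/bin/sh
# Hypothetical helper: count carriage-return bytes on stdin.
count_cr() {
    LC_ALL=C tr -cd '\r' | wc -c | tr -d ' '
}

tmpdir=$(mktemp -d)

# Made-up sample with Windows-style CRLF line terminators.
printf 'one\r\ntwo\r\n' > "$tmpdir/crlf.txt"
cr_before=$(count_cr < "$tmpdir/crlf.txt")

# Strip every carriage return, leaving plain LF terminators.
tr -d '\r' < "$tmpdir/crlf.txt" > "$tmpdir/lf.txt"
cr_after=$(count_cr < "$tmpdir/lf.txt")

echo "carriage returns before: $cr_before"
echo "carriage returns after:  $cr_after"

rm -rf "$tmpdir"
```

The same tr -d '\r' step could be placed in the pipeline before or after iconv, since the conversion leaves these bytes unchanged.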