iconv and sed help
usmangt
Posts: 42
in Command Line
Hi,
I have a UTF-8 file that I need to convert to ISO-8859-1.
The UTF-8 file contains characters like å/ä/ö, and I don't want these characters.
So I apply this sed command:
$ sed "s/å/aa/g; s/ä/aaa/g; s/ö/ooo/g" utf8.txt > output.txt
Now when I view this file, there are no characters like å/ä/ö left.
Then I use the iconv command to convert that UTF-8 file (output.txt) to ISO-8859-1:
$ iconv -c -f UTF-8 -t ISO-8859-1 < output.txt > newfile
BUT when I check the file type with the file command, it reports ASCII, not ISO-8859-1:
$ file newfile
newfile: ASCII text, with CRLF line terminators
I don't understand what went wrong. I have also attached the UTF-8 file to this post.
Please help.
usmangt
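A note on why this can happen: once sed has replaced every å/ä/ö, the file may contain only ASCII bytes. Pure ASCII is byte-for-byte identical in UTF-8 and ISO-8859-1, so iconv has nothing to change and file cannot tell the two encodings apart. The following is a minimal sketch of that effect, assuming a made-up sample (sample_utf8.txt and its content are hypothetical, not the attached file):

```shell
# Build a small UTF-8 sample with å, ä and ö and a CRLF line ending.
# (Hypothetical content; the octal escapes are the UTF-8 bytes of å, ä, ö.)
printf 'sm\303\245 \303\244pple \303\266l\r\n' > sample_utf8.txt

# Same substitution as in the post.
sed 's/å/aa/g; s/ä/aaa/g; s/ö/ooo/g' sample_utf8.txt > sample_out.txt

# Same conversion as in the post.
iconv -c -f UTF-8 -t ISO-8859-1 < sample_out.txt > sample_new.txt

# If both files are byte-identical, the conversion had nothing to change.
cmp -s sample_out.txt sample_new.txt && echo "identical: only ASCII bytes remain"
```

If the files compare as identical, the result is valid ISO-8859-1 and valid ASCII at the same time, and file simply names the narrower of the two.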
Comments
-
I went through your exact procedure on Slackware 13.1, and my output file shows as:
ut3.txt: ISO-8859 text, with very long lines
How the data is read and displayed may be controlled by a deeper configuration within your OS. Can you share which distro you use, so that those familiar with it can tell you where those settings are?
-
I am using the Fedora 13 Linux distribution.
-
Hi,
I am so sorry, I attached the wrong file (both have the same name but are in different folders on my machine).
This is the one that is causing the problem.
-
Here is the file.
I don't know why the name became so long when uploading.
[file name=utf8-7a6351909c73ba4a81575d6ad10cf46f.txt size=1131]http://www.linux.com/media/kunena/attachments/legacy/files/utf8-7a6351909c73ba4a81575d6ad10cf46f.txt[/file]
-
Now that I have processed your original file, I am getting the same issue; it appears that something differs between the files.
The two files are very different. I have combined your commands into:
$ sed "s/å/aa/g; s/ä/aaa/g; s/ö/ooo/g" utf8.txt | iconv -c -f UTF-8 -t ISO-8859-1 -o out.txt
When I ran that command against both files, I got the following output:
matt:~/Desktop$ rm *.txt.txt; for i in `ls | grep utf | grep -v "txt\.txt"`; do sed "s/å/aa/g; s/ä/aaa/g; s/ö/ooo/g" "$i" | iconv -c -f UTF-8 -t ISO-8859-1 -o "$i.txt"; file "$i"; file "$i.txt"; done
utf8.txt: UTF-8 Unicode text, with very long lines, with CRLF line terminators
utf8.txt.txt: ISO-8859 text, with very long lines, with CRLF line terminators
utf82.txt: UTF-8 Unicode text
utf82.txt.txt: ASCII text
Based on the output, it looks as though the line terminators in the second file are not ISO-8859-1 compliant, and the iconv application does not correct those.
-
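One way to test the line-terminator theory: CR and LF are themselves ASCII bytes (0x0D and 0x0A), so they should not affect whether file says ASCII or ISO-8859. What matters is whether any byte above 0x7F is left after the sed step. Here is a small sketch that counts such bytes; the function name count_non_ascii is just an illustration, not an existing tool:

```shell
# Count the bytes outside the 7-bit ASCII range (0x00-0x7F) in a file.
# tr deletes every ASCII byte, wc -c counts whatever is left.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\177' < "$1" | wc -c
}

# Hypothetical usage with the file names from this thread:
#   count_non_ascii utf8.txt.txt
#   count_non_ascii utf82.txt.txt
```

A count of 0 would suggest that file reports ASCII only because no non-ASCII bytes remain, while a count above 0 would match the "ISO-8859 text" result.
-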
Thank you for analyzing and checking it. Yes, I suspect the same thing, and I am also concerned about the ' - ' (minus symbol/character) in the file.
Do you think there is a solution for this?
Thank you
usmangt
-
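On the ' - ' concern: the plain ASCII hyphen-minus (0x2D) exists in ISO-8859-1, but typographic dashes such as U+2013 (en dash), U+2014 (em dash) and U+2212 (minus sign) do not, and as far as I know iconv -c leaves out characters it cannot convert. A rough sketch for finding such dashes before converting follows; find_unicode_dashes is a hypothetical helper name, and the three code points are only my guess at the likely candidates:

```shell
# Print the lines (with line numbers) that contain a non-ASCII dash.
# The patterns are the UTF-8 byte sequences of:
#   U+2013 EN DASH, U+2014 EM DASH, U+2212 MINUS SIGN
find_unicode_dashes() {
    LC_ALL=C grep -a -n \
        -e "$(printf '\342\200\223')" \
        -e "$(printf '\342\200\224')" \
        -e "$(printf '\342\210\222')" \
        "$1"
}

# Hypothetical usage:
#   find_unicode_dashes utf8.txt
```

If this prints nothing, the dashes in the file are most likely ordinary ASCII ones and would pass through the conversion unchanged.
-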
Can you tell me whether the two files were created on different platforms, for example file1 on Windows and file2 on Linux?
-
Well, both were created on Linux.
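Regardless of where the files were created, the line terminators can be checked directly, since "with CRLF line terminators" in the file output only means that lines end in CR LF. Below is a minimal sketch; the names count_crlf_lines and strip_cr are made up for illustration:

```shell
# Count the lines that end in a carriage return (i.e. CRLF-terminated lines).
count_crlf_lines() {
    LC_ALL=C grep -c "$(printf '\r')\$" "$1"
}

# Write the file to stdout with every CR byte deleted.
# Note: this also deletes any CR that is not part of a line ending.
strip_cr() {
    tr -d '\r' < "$1"
}

# Hypothetical usage:
#   count_crlf_lines newfile
#   strip_cr newfile > newfile_lf
```

Comparing the counts for both files would show whether they really differ in line endings, independently of the character encoding.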