Monday, February 8, 2010

Removing special characters from a unix filename in Java

Oh well! This was a very small issue, but it took us a long time to figure out how to rename some images that happened to have special characters in their names. A file that should have appeared something like this when we ran
>ls -l
-rwxrwxrwx -- 1234560.jpg
was appearing as
.jpgrwxrwx -- 1234560

and our attempts to rename it from the Unix shell did not succeed. We could have renamed such files manually with our FTP client, but we wanted a programmatic way of doing it. What finally worked was a simple Java file rename:


file.renameTo(new File(newName));
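
One thing worth knowing about renameTo is that it reports failure only through its boolean return value and never throws for a failed rename, so a bare call like the one above can fail without anyone noticing. The following is a minimal sketch of a wrapper that checks the result. The class name SafeRename and the decision to refuse to overwrite an existing target are assumptions for illustration, not something we had in our original program.

```java
import java.io.File;
import java.io.IOException;

/** Hypothetical helper, not part of the original program. */
final class SafeRename {

    /** Renames source to target and fails loudly instead of silently. */
    static void rename(File source, File target) throws IOException {
        // Assumption: we never want to overwrite a file that already has the target name.
        if (target.exists()) {
            throw new IOException("Target already exists: " + target);
        }
        // renameTo signals failure only through its return value.
        if (!source.renameTo(target)) {
            throw new IOException("Could not rename " + source + " to " + target);
        }
    }
}
```

A call would look like SafeRename.rename(file, new File(newName)), with the IOException handled or logged by the caller.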

The key was to find the file references. One way to obtain them is to find the parent directories of such files, list the files in each directory, and rename them. When we tried to get file references by using the file names directly, we were not able to do so.
For this we used the Unix find command to list all the files we wanted to correct. In our case all the files were under a certain directory, so we used
find . -type f > ~/defaulterImages.txt

Next we read this file into our Java program. For each file, we worked out the correct name (newName) and the parent directory (which we could derive from the file name), and then did this:

import java.io.File;

File aStartingDir = new File(aStartingDirStr);
// listFiles() returns null when the path is not a readable directory
File[] filesAndDirs = aStartingDir.listFiles();
if (filesAndDirs != null) {
    for (File file : filesAndDirs) {
        String fileTraversed = file.toString();
        // newName is the corrected path worked out for this file
        if (!file.renameTo(new File(newName))) {
            System.err.println("Could not rename " + fileTraversed);
        }
    }
}
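
The snippet above leaves out how newName is computed, and this post does not record which special character was involved. The garbled ls -l output is what a carriage return inside the name would look like on a terminal, so the sketch below assumes the fix is to strip ASCII control characters from each name. The class DirectoryRenamer, the method renameAll, and the control-character rule are all assumptions for illustration, not the code we actually ran.

```java
import java.io.File;

/** Hypothetical helper, not part of the original program. */
final class DirectoryRenamer {

    /**
     * Renames every regular file in dir whose name contains ASCII control
     * characters and returns how many renames succeeded.
     */
    static int renameAll(File dir) {
        File[] entries = dir.listFiles(); // null if dir is not a readable directory
        if (entries == null) {
            return 0;
        }
        int renamed = 0;
        for (File file : entries) {
            // Assumption: the "special characters" are control characters such as \r.
            String cleaned = file.getName().replaceAll("\\p{Cntrl}", "");
            if (!file.isFile() || cleaned.isEmpty() || cleaned.equals(file.getName())) {
                continue; // not a regular file, or nothing to fix
            }
            File target = new File(dir, cleaned); // keep the file in the same directory
            // Skip rather than overwrite a file that already has the cleaned name.
            if (!target.exists() && file.renameTo(target)) {
                renamed++;
            } else {
                System.err.println("Could not rename: " + file);
            }
        }
        return renamed;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(renameAll(dir) + " file(s) renamed");
    }
}
```

Building the target with new File(dir, cleaned) keeps the renamed file next to the original. Passing a bare name to new File would resolve it against the working directory of the Java process instead, which moves the file if the program is started somewhere else.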