I have an ARM7 dev board (AT91SAM7S256 MCU from Atmel) that I've used in a few projects, none of which was ever completed. It's a kind of testbed for developing code that will eventually run on a different CPU. One of the things I used this MCU for was SPI-based SD card support for a simple FAT16/32 library, which in turn was part of the GD emulator project. Once I was happy with the code I moved all the software to a Nios2 soft-CPU running on an FPGA.
Recently though I decided to pursue the original idea, that is, interfacing the FPGA with a separate MCU. In this case the FPGA would only handle the ATA/ATAPI stack and the I2S protocol, plus data buffering through an external SRAM chip. This could probably be done with a CPLD, but I already have an FPGA and the entry-level devices are cheap enough should I decide to stick with the Cyclone II series. This way I can have a logic analyzer (for debugging) in the project as well. So I fetched the ARM box from the shelf, unpacked the kit, connected the power supply, and realized I can't do anything with it :)
As you know I'm using Windows 7 Pro x64 now, and have had it for a few months already. I quite like it, and the only thing that irks me is the lack of low-level parallel port support. I have software written for Windows 9x that requires direct I/O access, something that was quickly fixed with giveio.sys in 2000/XP but will no longer work on a 64-bit OS. While I do realize that every piece of hardware and software will eventually become obsolete, and that direct I/O access belongs to the DOS era, I really miss that functionality. There are some solutions available but none of them is compatible with giveio. So I ended up with a non-functional chip programmer (an old piece of junk, but it works and has some unique features that I use), and it also turned out that my Wiggler-clone JTAG adapter is now unusable as well.
Without a JTAG adapter I can't update the ARM7 program. Sure, there is the SAM-BA boot manager, but I don't like it. Running the code from RAM is faster, but there is 256k of FLASH on this MCU and only 64k of RAM, and I need as much RAM as I can get for the buffers and structures involved in accessing the filesystem on the SD card. Also, since I run the part at 48MHz, I need to add one wait cycle for FLASH memory access (it can go full-speed only up to 30MHz), so I want to be running from FLASH to get an accurate picture of how fast the final code will be. Why not downclock the MCU to 30MHz? Well, the internal SRAM can still work full-speed at 48MHz, and this is important for SPI DMA. I'm also planning on having a USB uplink to a PC, so I need the clock rate to be 48MHz. And at 48MHz I can clock the SPI bus at 24MHz (the limit for SD cards is 25), which is important too.
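The SPI clock arithmetic can be sketched in a few lines of C. This is only an illustration, assuming the SPI clock is the master clock divided by an integer divider (which is how I understand the SAM7 SPI baud rate generator to work); `spi_divider` and `spi_clock_hz` are names made up for this example, not anything from the Atmel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest integer divider that keeps the SPI clock at or below limit_hz.
 * Assumption: SPI clock = master clock / divider. */
static uint32_t spi_divider(uint32_t mck_hz, uint32_t limit_hz)
{
    assert(limit_hz != 0);
    return (mck_hz + limit_hz - 1u) / limit_hz;   /* ceiling division */
}

/* Resulting SPI clock for a given master clock and device limit. */
static uint32_t spi_clock_hz(uint32_t mck_hz, uint32_t limit_hz)
{
    return mck_hz / spi_divider(mck_hz, limit_hz);
}
```

With a 48MHz master clock and the 25MHz SD limit the divider comes out as 2 and the bus runs at 24MHz. At 30MHz the divider is still 2 (30MHz itself is over the limit), so the bus would drop to 15MHz.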
Anyway, turns out there are some cheap DIY USB JTAG adapters out there, based on FT2232 chips. I gotta say, FTDI did a good job there: there are other USB serial port chips, but theirs are surely the most popular, and cheap and easy to get too. This time though I was too lazy to solder everything myself, so I figured I'd buy a clone of a well-known JTAG adapter and have it delivered. While putting together your own tools and hardware is fun, I really wanted to have a go at that ARM, and quite frankly I wouldn't be able to make my own adapter for less than some 25 Euro - which is how much I paid. It's an Amontec JTAGkey clone. It does the voltage level shifting too, though over a more limited range than the original, but 2.7-5V is enough for my needs.
The first problem was that the FT2232 in the adapter uses a non-default PID, so the stock FTDI drivers will not recognize it. Amontec haven't prepared a 64-bit driver, it seems, and I didn't want to use theirs anyway - who knows how old it is and what bugs it has. Instead I modified the INF files in the newest FTDI driver package and added all the necessary entries, along with the correct product names. Obviously the edited INF no longer matches the signature that ships with the driver package, so Windows complained - but all I had to do was confirm and it installed properly. I've heard about some lengthy procedures for installing unsigned drivers, involving switching 64-bit Windows to some test mode and what not, but I didn't need that. With this approach I get to use one driver for all connected FTDI devices, I can always update it, and most importantly - it works perfectly.
The second problem was OpenOCD. While it's free to use - something I really like in my software - it's no longer possible to find a precompiled binary built against the FTDI drivers. It seems that sometime in 2009 they figured that linking with a 3rd party closed source library, even dynamically, violates their GPLv2 license. Picture me doing a facepalm here. There are GPL replacements for the USB libs but - figures - these don't work well with 64-bit Windows OSes. I know that license stuff is important, but why shoot yourself in the foot like that... One could argue it's Microsoft's fault for requiring the drivers to be signed and disliking open-source software in general. That's just one side of the coin, though. The signing process is not exactly extremely difficult or expensive, and the authors could have just made an exception for the FTDI libs. It's their code after all. To me this looks more like a case of open-source people not liking Microsoft :)
Well, I can always compile the damn thing myself. My personal build can be linked to the FTDI libs without violating the GPL. It's just... the sources are missing Makefiles and the configuration script needs to be run in a POSIX-like environment. Picture me doing another facepalm here. I'm kinda allergic to Cygwin, and the various bits and pieces of information I found suggested that OOCD can be natively built on Windows, so I got creative. I installed MinGW64 - well, unpacked really. Once you have several compilers in the system, each targeting a different CPU, you need to be careful about what ends up in the PATH variable :) For ARM development I use devkitPro, which is also the first choice for Nintendo DS homebrew as it comes with nice system libs, and it sports an MSYS environment. I married that MSYS with the MinGW paths and finally got OpenOCD to configure and compile itself. And hey, it works too :) I only wish this method was at least mentioned somewhere in the docs, because README and INSTALL are pretty much targeted at *nix platforms only - again, maybe it's just me, but it sure looks like someone doesn't like Microsoft.
The third problem was that my ancient OOCD configuration scripts didn't work with version 0.4.0 and I had to rewrite that stuff. I strongly recommend you read the docs here; it's much easier once you realize you can reuse the ready-made scripts and example files. At first I copy-pasted some of the stuff I googled, but ended up with a pretty big script I didn't understand much. Then I found the TCL folder in the OOCD sources, and most of the configuration data I needed was already there :) Long story short, my script is now working and I can program the ARM7 FLASH memory with my code.
The whole purpose of this endeavour was to try out a few ideas I had, one of them being running the MCU in Thumb mode rather than ARM mode all the time, which should improve code execution speed. Another one was to try multiple block reads. Single reads are easier, but each time you issue a request the card will waste about 1-2ms preparing itself for the transfer. Multi-read obviously works only for contiguous sectors, but there is next to no delay between one sector and the next, so unless the files are very fragmented this will result in a significant speedup. There is a small issue of having to properly stop the multi-read sequence if the address you want is not the next one (and it will happen every now and then, since FAT structures need to be read and processed as well), but I got that worked out. The end result for a sequence of 1000 sectors:
- single block reads: 1100kB/s
- multiple block reads: 2500kB/s
Effective file read speed (12MB file on FAT16) was well over 1800kB/s. With these figures I am finally back on target for the x4-x12 CD read speed that a real GD-ROM drive can do (x1 is 150kB/s, so x12 works out to 1800kB/s).
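The stop-and-restart handling for multi-reads boils down to remembering which sector the card will deliver next. Below is a rough sketch of that idea, not my actual driver code: the `card_*` functions are hypothetical stubs that only count commands, where a real driver would send CMD18 (READ_MULTIPLE_BLOCK) and CMD12 (STOP_TRANSMISSION) over SPI and clock in 512 bytes of data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    bool     streaming;    /* a multi-block read is currently open */
    uint32_t next_sector;  /* sector the card will deliver next */
    unsigned starts;       /* multi-reads started (CMD18 in a real driver) */
    unsigned stops;        /* multi-reads stopped (CMD12 in a real driver) */
} sd_reader;

/* Hypothetical stubs: they only count commands instead of talking to a card. */
static void card_start_multi_read(sd_reader *r, uint32_t sector)
{
    (void)sector;
    r->starts++;
}

static void card_stop_multi_read(sd_reader *r)
{
    r->stops++;
}

static void card_receive_block(sd_reader *r, uint8_t *buf)
{
    (void)r;
    (void)buf;             /* a real driver would fill buf with 512 bytes */
}

/* Read one sector, keeping the multi-read open while requests are sequential. */
static void sd_read_sector(sd_reader *r, uint32_t sector, uint8_t *buf)
{
    assert(r != NULL && buf != NULL);

    /* Requested sector is not the one the card is about to send: stop first. */
    if (r->streaming && sector != r->next_sector) {
        card_stop_multi_read(r);
        r->streaming = false;
    }
    /* No open multi-read (first call, or just stopped): start a new one. */
    if (!r->streaming) {
        card_start_multi_read(r, sector);
        r->streaming = true;
    }
    card_receive_block(r, buf);
    r->next_sector = sector + 1u;
}
```

Starting from a zeroed `sd_reader`, reading sectors 100 to 109 in order issues a single start and no stop. A jump to some other sector (say, a FAT lookup) then costs one stop and one new start, after which sequential reads are free again.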
- Music: Vibrasphere - Archipelago