Deunan
Recent Entries 
10th-Apr-2013 05:07 pm - Condition red
Spot differences between two pictures:



The first one was taken on a system with Catalyst 13.1 WHQL and the second one on 13.2 beta. A day wasted on fixing this particular "bug". And before someone points out "This is what you get for using beta drivers" - there is a reason I need to be on 13.2 or newer.
29th-Jan-2013 12:47 am - A little help
Dave Murphy from devkitPro.org is trying to raise some money for his project. Please go visit this web page and see if you can help.

devkitPro is a great dev platform for all Nintendo DS homebrew, and also other consoles that I'm not really that familiar with: GBA, GP32, PSP, GC, Wii. It's a very nice set of free tools and libraries for Windows OS, and we should support it because there aren't really all that many.

In fact I'm using it myself - for ARM-related stuff, including the GDEMU project. The VMS for NDS was built using devkitPro, for example. There is some work being done to have SH4 family support in there as well, so that should also be interesting to the Dreamcast homebrew community.
26th-Jan-2013 06:36 pm - GPU computing
Thanks to an article on SemiAccurate I learned about a new AMD gadget called, wait for it, Gizmo. You can see it here.
Gizmo was most likely inspired by the success of Raspberry Pi: dev boards like this existed before but were never this cheap, or even available to the general public. Let's compare it to RPi and Intel products then :P

- It's a PC (even runs Windows 7 since that's how AMD guys measured the silicon temperatures under stress).

First, it needs 3V "button" lithium battery, which is mandatory but apparently not part of the kit. In fact it has to be a battery with wires and a small plug, like in laptops, so forget about buying it in TESCO. Tsk.
Then you'll need a SATA hard drive, or SSD, so again forget about cheap SD cards. I suppose a CF card with IDE-to-SATA interface might do the trick if you don't need performance.
Lastly it will obviously need more power than a USB phone charger can provide, much more. The good news is it will accept anything from 9 to 24 volts, so it can be run on 12V lead-acid (car) battery for example.

So, compared to RPi it's not really that great for small projects. It's on par with some of the Intel N-series Atom ITX boards, like D945GSEJT or DN2800MT. Its form factor places it somewhere between RPi and ITX.

- It needs cooling (unless used in a lab environment).

While the board is all passive-cooled it's clearly stated in the docs that this is just enough for 25C ambient temperature and only without a case. If you want to put it in a case or use it at higher temperatures you'll need to add a fan to the CPU radiator. There is a fan connector on the PCB for that purpose, though I would've liked bigger heatsinks.
The CPU itself is rated at 6.4W but there's also the companion chip (the "south bridge") to consider. The VRM section is also going to generate some heat but I assume it can deal with it in most situations.

Again, a win for RPi and possibly also for the two Atom boards I mentioned since these will work in a case if there is enough convection present. I've seen fanless cases for these Atom boards so it can be done. Obviously though it depends a lot on where that case will be put :) It might work in an air-conditioned room but not otherwise in summer heat. It's not black and white here.

- It's an APU.

And now we're talking. It's not that much smaller than ITX board and possibly runs hotter so does it have any good sides to it? Yup, the computing power available.
It's a dual-core fully out-of-order AMD64 architecture CPU clocked at 1GHz. That might not look very impressive compared to 1.86GHz N2800 Atom, which is also 64-bit capable and dual-core, with Hyper-Threading to boot, but Atoms are in-order architecture. Turns out it's difficult to make code that would not choke in-order CPUs so much. The compilers are to blame although some code (semi-random branching for example) is just not predictable enough to properly optimize.
The APU is not just CPU though, it's also the GPU next to it. Radeon HD 6250 in this particular case, with 80 shaders clocked at 280MHz.

So why exactly is a measly mobile GPU, the lowest of all AMD has to offer, that much of a win? Because its 80 shaders equal 1 compute unit (CU), and you can do other stuff with it than just drive VGA output.

To make a point here I've run some tests. My code was trying to brute-force crack an M4-type encryption key from dumped NAOMI data. These keys are only 32 bits long and the encryption algorithm is not even that complicated once you see it - again, thanks to Andreas Naive for making "obvious" things actually obvious to us, mere mortals :)
I wrote a cracker in C that, given a key, will decode 8 bytes of data and compare it with a known pattern to check for a match. To scan the entire key space you need to run this code 4294967296 times. A typical, simple approach would be to create a cracking procedure that takes a key value as an argument and then make a loop that will call this procedure 2^32 times, checking the result. Here's how long it takes:

* Intel Core2 Duo E6600
- 1 core @ 2400MHz (2nd core not used)
- full out-of-order architecture
- Windows 7 Professional 64-bit
- 64-bit code (MinGW64 4.5.3 -O2)
+ 415s

* AMD Athlon XP processor 1700+
- 1 core @ 1466.909MHz
- full out-of-order architecture
- Debian Linux, 2.6.32 kernel
- 32-bit code (gcc 4.4.5 -O2)
+ 937s

* Intel Atom N270
- 1 core @ 1596.095MHz (HT not used)
- in-order architecture
- Debian Linux, 2.6.32 kernel
- 32-bit code (gcc 4.4.5 -O2)
+ 2799s

* Raspberry Pi ARM11
- 1 core @ 900MHz (O/C, core @ 450MHz, SDRAM @ 450MHz)
- ARMv6 architecture
- Raspbian Linux, 3.2.27 kernel
- 32-bit code (gcc 4.6.3 -O2)
+ 3378s

As you can see it takes some time, and the in-order Atom and RPi ARM are especially bad at it. And my RPi is running overclocked, the typical values are 700MHz for CPU, 250MHz for core and 400MHz for SDRAM so in reality it's even worse. Obviously you don't want to run crackers on your small dev board but what if this was face/shape recognition based on images from small camera on a robot? That does seem like a plausible use case.
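The simple single-threaded approach timed above boils down to something like the sketch below. It's in Python rather than C to keep it short, and `toy_decode` is a made-up stand-in (a plain XOR with the key bytes), not the real M4 algorithm; the 8-byte pattern and the key value are invented as well. Only a tiny slice of the key space gets scanned here, the real run covers all 2^32 values.

```python
from typing import Optional


def toy_decode(key: int, block: bytes) -> bytes:
    """Hypothetical stand-in cipher (NOT M4): XOR data with the 4 key bytes."""
    key_bytes = (key & 0xFFFFFFFF).to_bytes(4, "little")
    return bytes(b ^ key_bytes[i % 4] for i, b in enumerate(block))


def brute_force(cipher: bytes, known: bytes,
                start: int = 0, stop: int = 1 << 32) -> Optional[int]:
    """Try every key in [start, stop) and return the first one that matches."""
    for key in range(start, stop):
        if toy_decode(key, cipher) == known:
            return key
    return None


KNOWN = b"NAOMI   "                   # invented 8-byte known pattern
SECRET = 0x1234                       # invented key, small so the demo is quick
CIPHER = toy_decode(SECRET, KNOWN)    # XOR is its own inverse
FOUND = brute_force(CIPHER, KNOWN, 0, 0x2000)
print(hex(FOUND))
```

The loop body is completely independent for each key, which is exactly why this kind of job splits so well across cores or compute units.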

Now there's this stuff called OpenCL which lets you distribute your computation-heavy tasks over multiple CPU cores, and also GPU compute units. I used the same cracker, except the main loop was thrown out and replaced by OCL framework. Here's how it went:

* Intel Core2 Duo E6600
- 2 cores @ 2400MHz
- full out-of-order architecture
- Windows 7 Professional 64-bit
- OpenCL code (AMD APP 2.6)
+ 105s

* AMD/ATI Radeon HD 5770
- 10 compute units @ 850MHz
- GPU architecture
- Windows 7 Professional 64-bit
- OpenCL code (AMD APP 2.6)
+ 6s

Yeah, that's a whole 6 seconds. Not all code gets that much of a boost on GPU; this one was integer based with some logic operations but didn't have many branches in it. Even the CPU version got about twice as fast per core as simple C code (415s on one core vs 105s on two), most likely due to aggressive compiler optimizations - most loops had just 4 passes so it's a great place to unroll and use SSE2 vectorization.

Now, my 5770 has 10CUs clocked at 850MHz so in total 8500PU - "power units". It ran for 6 seconds so it needed 51000PUs to complete the task. The 6250 has only 1CU at 280MHz so 280PUs total. 51000/280=182 seconds. In reality probably a bit more due to slower data transfers. Compare that to Atom results and you'll see why having that GPU is important :)
With dual-core CPU you can easily run a lot of data processing and offload the really heavy stuff to GPU, so it appears to be a great dev board for more advanced projects.
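Here's the same back-of-the-envelope estimate as a few lines of Python. It assumes, like the text does, that run time scales linearly with compute units times clock and ignores data transfers; the function names are just for illustration.

```python
def work_units(compute_units: int, clock_mhz: int, seconds: float) -> float:
    """Total 'power units' spent: CUs * MHz * seconds."""
    return compute_units * clock_mhz * seconds


def estimated_seconds(work: float, compute_units: int, clock_mhz: int) -> float:
    """Time the same job would take on a GPU with the given CUs and clock."""
    return work / (compute_units * clock_mhz)


WORK = work_units(10, 850, 6)                # Radeon HD 5770: 6s measured
EST_6250 = estimated_seconds(WORK, 1, 280)   # Radeon HD 6250: 1 CU @ 280MHz
print(WORK, f"{EST_6250:.0f}s")
```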

Now why did I bother with this long-winded explanation? Well, it looks like AMD has got all three next-gen consoles in the bag. We've had a lot of "insider leaks" lately, most of it is wishful thinking taken for gospel, especially when it comes to fanboys. Silly people. It's not about raw power anymore. Consoles will not be able to beat PCs with the numbers, not unless you want them to draw 1kW of power and cost the same as a rack full of servers. It's about being smart with what limited resources you have. One can argue that's always been the case but this generation will show it even more. A typical PC that can run games in 1080p in 3D at 60fps would need some 300-400 Watts of power. Next gen consoles are promising the same level of fidelity (well, we shall see about that I guess) at half that power. This is what I find most interesting. I couldn't care less if the CPUs are 1.8 or 3.2GHz and how many gigabytes of RAM there are inside.

BTW, I've made some additional calculations. My RPi runs on 5V and draws 0.5A so it used up 5V * 0.5A * 3378s = 8445Ws to get the calculations done. My Radeon 5770 has 108W TDP so let's assume I actually hit that, and that the rest of my PC drew 150W, which is a VERY safe assumption as the CPU was idle and so were the HDDs. (108W + 150W) * 6s = 1548Ws. So not only was it faster but it also used less energy :) Nice things, these compute units. With 16 thousand 128-bit wide registers it's no wonder each takes so much silicon space.
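For anyone who wants to plug in their own numbers, the energy comparison is just power times time. A tiny sketch, using only the figures quoted above (the 108W TDP and 150W for the rest of the PC are the same worst-case assumptions):

```python
def energy_ws(power_watts: float, seconds: float) -> float:
    """Energy in watt-seconds (joules) for a constant power draw."""
    return power_watts * seconds


RPI_WS = energy_ws(5 * 0.5, 3378)    # 5V * 0.5A for 3378s
PC_WS = energy_ws(108 + 150, 6)      # GPU TDP + rest of the PC for 6s
print(RPI_WS, PC_WS, f"ratio {RPI_WS / PC_WS:.1f}x")
```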
18th-Dec-2012 03:18 pm - End of the world
I might not believe that the world will end this December but two of my PCs decided not to wait and committed suicide.

My netbook died first, about a month ago, one day simply didn't turn on and that was it. No amount of messing with its internals would help. It was an old hand-me-down with Atom N270 that I got for free because of failed HDD. I replaced it, reinstalled OS and kept using it for a year or so. It had Win XP, 1024x600 matte screen, 1G of RAM and the battery would hold for about 2 hours - which was good enough for my needs. Hell, it flew with me around Europe a few times. I wasn't using it much at home so I don't need to replace it right away but I sure miss it.

Yesterday another N270 gave up the ghost, this time it was my Linux system that I keep running 24/7 for various purposes: router for my private LAN, WiFi AP, FTP/NFS server, and most importantly my dev machine for Dreamcast and NAOMI since I keep my cross-compiler tools there. I liked this board too, it was all-passive cooled and required only 12V input from a brick-type PSU so there were no fans at all. I think the BGA balls cracked because I had been getting random reboots lately and last week the system would not boot up until it had cooled down to room temperature. Eventually even that stopped working and now it will reboot randomly within 10 seconds of powering up, cold or not. So, right now half of my flat has no Internet and I need to fix that ASAP.

I ordered a new board, it's another Atom (N2800 this time) since I really want to keep the energy usage down to bare minimum and I don't need a lot of CPU power. Even N270 could easily deal with 100Mb/s traffic on both NICs while streaming from HDD, and it was 2.5W rated. Yeah, I know, it doesn't include the north bridge which was doing most of the job connecting all system components together :) So N2800 might be 6.5W but I expect NM10 to have improved over the old 945 (and GPU is now part of the CPU as well). I was also interested in AMD Brazos family but those chips are much more powerful and require active cooling, and I don't need Radeon HD in a headless PC. The good news is the new board will also be powered by single 12V so at least I get to keep the PSU - hopefully. I already had to buy a new memory stick (DDR3 now instead of DDR2), a new low-profile NIC (no PCI slot, just one PCI-E x1), and a new N-capable WiFi card (miniPCI-E). Well, at least my netbook HDD is going to be reused :P

There is one more old PC that I have, and obviously my main one that is not very old but it has its years. I swear, one of them dies in the next few weeks and I'm buying a replacement and calling it Apollo 13. In the meantime I started doing more frequent backups.

Anyway, so what's up with the GDEMU project. Well, there is progress but I've hit some problems - as usual. I came up with new logic for the FPGA and it works perfectly (so far) between MCU and FPGA but fails on the GD bus. And I have no idea why, I've tried pretty much everything by now, except adding some pull-ups to control lines but I don't expect this to help much. Doesn't look like an electrical problem.
The prototype works when the FPGA is clocked within a very specific frequency range, but not really otherwise. BIOS loads the game, I get to see the first screen or so and then it dies because DMA goes completely out of sync - I still have tons of data in the buffer but the console expects to see the end-of-DMA interrupt already. So obviously I'm missing a lot of read requests but I don't know why. Must be another race condition that I can't figure out. So, why not let it run at the frequency where it works? Because the problem is still there, just not as obvious. It's not stable either way and you wouldn't want your game to freeze 3 hours in and who knows how long since last save, right?

To combat that I finally gave in and bought a USB-based JTAG programmer for Altera FPGAs. Those things are costly but I found a cheap clone that should work nicely. I expect it to arrive in a few days. With a live JTAG uplink I will be able to transfer new settings directly rather than have to swap SD cards as I do now, and more importantly I'll be able to run a logic analyzer to see what is going on.

The world will most likely not end but thanks to all those troubles (and GoG discounts :) my bank account balance just might.



EDIT: Looks like it could be an electrical issue after all. Well, I'm going to rip the Dreamcast apart now and solder some proper wires for the ground return path. Let's see what that does.

Oh, and here's a photo:

2012-12-19 GD-EMU proto test

As you can see I got the JTAG unit today and I'm fresh out of USB ports on the hub :)



EDIT 2: Apparently one year was not enough to add proper idle support for Cedar Trail Atoms to Linux kernel. Not even the bleeding edge 3.7.1. If you run dmesg |grep intel_idle you'll see this:

intel_idle: does not run on family 6 model 54

So, if you're in this situation as well and you don't mind compiling kernel from sources, try this hack:

1) Locate drivers/idle/intel_idle.c in the source tree
2) Make a backup copy just in case :)
3) Edit the file, find "intel_idle_ids" table
4) Add "ICPU(0x36, idle_cpu_atom)," line to it, but keep it sorted by model code

Compile and install the modules and kernel. Reboot. Enjoy.

Now, I'm not saying this is the proper way of doing it but 20mA less current draw from PSU (at 12.2V) says it's working. I haven't seen any nasty side effects yet.
3rd-Sep-2012 05:35 pm - Genesis contd.
And behold, it was very good...

Okay folks, since there are so many questions about the GD-EMU project and no one can be bothered to read the answers from the time I showed you my first iteration of the idea, here it is all again:

1) Ready when?
No idea. I would not be making a custom PCB and ordering new parts and working on it if I didn't believe it can be done, but at the same time I cannot (and will not) make any promises about delivery dates. Obviously though if I can't make it work as I'd like in the next few months it's going to be shelved again.

2) How much?
Again, no idea. In fact it's not even decided I will be selling those. If it doesn't seem like I can turn a profit without investing all my free time into it, I'll just stop at prototype phase. While I understand that it would upset many of you, I'm not a charity worker. It's one thing to code a free application and share it with the world and quite another manufacturing a hardware device for sale.

All I can say right now is the prototype is pretty expensive (compared to a price of a working, pre-owned Dreamcast). But that is true for all prototypes. Things get considerably cheaper when mass-produced. Then again it's quite possible the first batches will still be priced higher because of low volume of sales - I'm sure as hell not going to invest my own money into this.

3) Kickstarter? Preorders?
While Kickstarter seems like a good option, it's a no-no because I'm not a US resident. End of story right there. I will also not take any kind of preorders (or other money offers) until I'm certain the device will work and can be manufactured in suitable quantities. Things get serious when money is involved and I'm a rather cautious person.

4) Features?
It will be a 100% compatible replacement for GD-ROM drive, except using SD cards. It might offer better loading times but otherwise will function in the same way. It's meant to provide a backup solution for the laser and other mechanical parts of the drive which are no longer in production and fail after so many years of use. While many of you will interpret this last sentence as "it will play game rips" I'd like to point out that I never condoned software piracy. I think I made my point clear when I refused to fix any bugs in Makaron that were related to CDI rips of the games (as opposed to proper GDI images). Many of these "bugs" were actually how the rips worked on a real console, although these could be somewhat helped if I wanted to. But I didn't. So, if you are/were a Dreamcast user then you should be familiar with region locks, video cable restrictions, bootable (or not) homebrew, etc. Using GD-EMU will not remove/help with any of these. You might try image patching, sure, but I will not give any support for these modifications if there are any problems.

As for user interface - I like simple things that work as expected. I've seen too many projects that looked nice but didn't deliver what was promised in the first place. My goals are perfect compatibility and stability. Anything else is extra. I think 2 buttons is enough to select which game on the card should be "inserted".
If that's not enough for you, code a good Dreamcast app that will select games from the card - it can be put as the first image on it, which will boot by default. Then we can talk about how to make the hardware do what the app/user wants.

5) USB link to PC?
That's in plans, but no work has been done yet. I'm not even sure the USB port on the prototype works properly :) So, eventually yes, but probably not from the start. USB host support (as in USB HDDs and FLASH drives) is probably not going to happen. Did I mention I like simple solutions?

6) Other features?
Well, if it ever happens that I make tons of profit on these things, which I doubt, I might reconsider my stance on UI, USB host, and other things. But that would have to be a considerable amount of money to motivate me :)

7) Open source?
Highly unlikely. If only because some people could just take all my work and start selling their own devices. While I'm not stopping anyone from creating a different/better project, they better be prepared to spend as much time on it as I have. I've already helped many people by sharing important bits and pieces of info, and even programs made by me. There is goodwill and there is stupidity - and I have to say that more often than not I've come to regret my decisions. Once burned...

8) Pics or it didn't happen.
There are photos of my all-FPGA approach on this blog, and even some short movies on YT of it working (with minor issues) if you know where to look. I will post pictures of the V2 prototype connected once it actually does work. I'm redoing much of my FPGA code and this might take some time as I want to try another approach.
30th-Aug-2012 02:44 pm - Genesis
Let there be light:
2012-08-30 GD-EMU proto V2 #1

And there was light:
2012-08-30 GD-EMU proto V2 #2

More pictures to follow soon :)

Status so far:
Voltage regulators - check
MCU starts - check
Bootloader operational using 3V3 UART - check
MCU JTAG - check
C runtime stub + simple exception handling - check
Status LEDs - check
UART 115200 8N1 console - check
Interrupts - check (need to investigate if registers are really properly saved though)
External RAM - check (problem found, should be fixed now)
High speed SD interface - in progress

Random fun fact: Many SD/SDHC cards exhibit various little quirks in SPI mode so the code needs to be aware of those to work properly in every case. One would think the native SD protocol is so tightly standardized that there should be no such surprises. Well, I just found a bunch of 2GB Kingston SDs that respond to ACMD41 with bad CRC7...

EDIT: Turns out the R3 answer is the only one not protected by CRC7, that space is marked as reserved and just filled with all-ones. I'm still not getting the busy bit within reasonable times on these Kingstons but I suppose reading the docs a few more times might teach me something new again.
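For reference, here's roughly what the check looks like. This is only a sketch in Python (not the MCU code), assuming the usual 48-bit frame layout with CRC7 polynomial x^7 + x^3 + 1; `frame_ok` is a made-up helper name and the OCR value in the R3 example is invented. The CMD0 frame is the well-known one ending in 0x95.

```python
def crc7(data: bytes) -> int:
    """Bitwise CRC7, polynomial x^7 + x^3 + 1 (0x09), initial value 0."""
    crc = 0
    for byte in data:
        for i in range(8):
            bit = (byte >> (7 - i)) & 1      # feed bits MSB first
            top = (crc >> 6) & 1
            crc = (crc << 1) & 0x7F
            if bit ^ top:
                crc ^= 0x09
    return crc


def frame_ok(frame: bytes, is_r3: bool = False) -> bool:
    """Validate the last byte of a 6-byte frame: CRC7 plus end bit."""
    if len(frame) != 6 or not frame[5] & 1:  # end bit must be 1
        return False
    if is_r3:
        return frame[5] == 0xFF              # reserved all-ones, no CRC to check
    return crc7(frame[:5]) == frame[5] >> 1


CMD0 = bytes([0x40, 0x00, 0x00, 0x00, 0x00, 0x95])
R3_EXAMPLE = bytes([0x3F, 0xC0, 0xFF, 0x80, 0x00, 0xFF])  # invented OCR value
print(frame_ok(CMD0), frame_ok(R3_EXAMPLE, is_r3=True))
```

Running the generic CRC check on an R3 frame is what makes a perfectly good answer look like a "bad CRC7".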

Anyway, here's the actual thing:
2012-08-30 GD-EMU proto V2 #3

Now it's a proper prototype, with all these wires and blinking LEDs. A few things are still missing on the PCB but right now I need to get SD protocol working so I can fetch FPGA configuration image and test it.

EDIT:

High speed SD interface - check
DMA on SD i/f - check
Basic FAT support - check
FPGA - in progress

I'm using my own FAT library, which has no write support but it was designed to be fast while consuming as little RAM as possible. In fact current SD cards are so fast it makes sector buffering impractical, since the lookups and LRU queues kill any gains with additional overhead. I suppose it'd be different if the CPU was clocked above some 400MHz and had some fast L1 cache.
Right now I get average of ~10MB/s in test that seeks to random part of 1.2GB file and reads 1-3500 consecutive 2352-byte long chunks. This is to simulate RAW image reads for GD-ROM. So pretty well I'd say, a nice boost compared to 2.5MB/s I got over SPI.
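A PC-side version of this kind of test could look like the sketch below. It's not the firmware test itself, just the same idea in Python: seek to a random sector, read a random number of consecutive 2352-byte chunks, and count bytes and time. Sector-aligned seeks, the pass count and the small temporary file are all assumptions made to keep the demo quick.

```python
import os
import random
import tempfile
import time

SECTOR = 2352  # raw CD sector size in bytes


def random_read_test(path, passes=20, max_chunks=3500, seed=0):
    """Read random runs of raw sectors; return (bytes read, seconds)."""
    rng = random.Random(seed)
    total_sectors = os.path.getsize(path) // SECTOR
    read_bytes = 0
    started = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for _ in range(passes):
            first = rng.randrange(total_sectors)
            # never run past the end of the file
            count = min(rng.randint(1, max_chunks), total_sectors - first)
            f.seek(first * SECTOR)
            for _ in range(count):
                read_bytes += len(f.read(SECTOR))
    return read_bytes, time.perf_counter() - started


# Demo on a small throwaway file instead of a 1.2GB image
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(SECTOR * 1000))
    DEMO_PATH = tmp.name
READ_BYTES, ELAPSED = random_read_test(DEMO_PATH, passes=10, max_chunks=50)
os.remove(DEMO_PATH)
if ELAPSED > 0:
    print(f"{READ_BYTES / ELAPSED / 1e6:.1f} MB/s")
```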

The native SD interface required a pretty much complete rewrite of some code, so I'm not 100% sure it's stable and all, but seems to work for hours without problems so far.
23rd-Jun-2012 02:56 am - With a piece of wire
SDR in 5 easy steps:

1) Design your radio

2012-06-23 DR2A PCB

2) Build it

2012-06-23 DR2A working

3) Get control software

2012-06-23 HDSDR @ 7880kHz

4) Get decoder software

2012-06-23 MULTIPSK HF FAX

5) Amaze your friends

2012-06-23 Weather Fax 7880kHz


Considering I got this far with a 2m piece of wire hung across my window, I'd say it's a success :)
4th-Jun-2012 01:06 pm - My name is Bond
Disclaimer: This entry is not emulator related. If you're here only for Makaron news, please ignore it.

I found a very interesting paper a few days ago, here's a link: Breakthrough silicon scanning discovers backdoor in military chip

This is just a brief summary of a side-channel attack on Actel/Microsemi ProASIC3 chips. The power analysis idea alone is mighty interesting - for such a big silicon structure as an FPGA it seems impossible at first, but then it's not the whole chip that has been put under scrutiny. It's just the part responsible for JTAG and some of the internal addressing structure, but even then measuring the subtle current fluctuations is not exactly trivial. While simple on paper, this task requires a number of fast and well-calibrated ADCs and most importantly software capable of processing the collected data. I could probably come up with a good enough electrical component of such a testbed but that software part is like magic to me - and I'm no stranger to programming :) Ah well, I suppose it's better for me to accept that certain math problems are way above my head, less stress that way (I guess ignorance IS bliss after all).
Anyway, the ProASIC3 series has been well designed, with countermeasures in place to thwart this type of attack. The paper mentions fuzzy clock sources and very low leakage transistors, as well as careful structure design - all these factors contribute greatly to the chip's security. It is, after all, touted as the most secure FPGA available on the market today. But there's a twist: apparently the chip has some undocumented features that allow one to read its configuration and internal memories even after it has been secured with a secret key.

You see, when it comes to JTAG it's pretty much normal that most (if not all) of the protocol commands intended for factory testing are kept secret. This is to protect the manufacturer secrets, nothing strange to it. However, it's one thing when the design has, say, additional structures to improve yields and some of them are disabled even if fully operational in order to keep all chips with the same specs. It's a dirty little trick but in the end acceptable. It's quite another story with intentional backdoors placed into the design.

Hidden features like these are handy for the people who make the product, in case there is a problem that needs to be fixed after the production has been completed. It doesn't have to be a hardware thing - remember the first X360 DVD drives? Those could be reprogrammed if you just knew how to do it, sometimes even via the SATA link so there wasn't even any need to open the cover. This was a firmware backdoor and once discovered it was ultimately used to enable the drives to accept recordable media as original games. I seriously doubt Microsoft had any knowledge of this prior to discovery by hackers, this was most likely something the drive manufacturer added on its own to be able to reuse returned/repaired drives. This 'cost saving' in company A caused company B much grief and money loss.

So, the big question: Was it Actel that put this 'feature' in the design, or was this added by the Chinese factory that actually made the chips? Either way this will have far reaching consequences as the ProASIC3 series is often used in military equipment. So I wonder if it's just a blunder or some sort of espionage attempt. Actel does claim that their products are secure and the contents of the chip cannot be easily recovered as there is no way of doing so. Clearly, that is a lie. Obviously the company now says that there isn't any backdoor but you can't prove that something doesn't exist. On the other hand, the researchers will need to verify their claim by showing a successful attack attempt to the public. So I guess we wait and see what happens. If it turns out to be true Actel's whole CPLD/FPGA branch could be finished.

In other, somewhat related news: I've taken some interest in software defined radios. I have to say that just by researching the subject my understanding of modern signal processing technology has gone up quite a bit. The best SDRs out there use FPGAs for signal processing - ADC interface, sampling rates, decimation, digital filtering, etc. A proper design can cover a big chunk of frequency spectrum while sampling at many Ms/s. The best way to go about it is to use quadrature demodulators to produce I/Q signals. These can be then freely processed PC-side (as long as you have CPU power to do so) to obtain anything from AM radio audio to satellite imagery. Problem is, these toys are costly :) I'm not about to throw 3-4k$ at my 'new hobby' so the only other options are:

1) Self-made QAM mixer/demodulator that uses audio frequency output. This can be plugged into a modern PC soundcard and sampling rates of 96kHz or even 192kHz are available, at a very nice 16-bit resolution. Some cards can even go as high as 24 bits, though in reality the SNR is probably not going to exceed some 20-22 bits even in the best of conditions. I've actually designed a PCB based on the YU1LM DR2A design, which is pretty simple and anyone can make it cheaply since there aren't any hard to come by parts. It needs an external VFO at 4x the local oscillator frequency but I can make that too. Worst case scenario I'll buy a cheap DDS kit based on AD9850 or something like that.

Here's what the PCB will look like:
2012-06-05 DR2A PCB

2) For some 20$ you can buy a USB DVB-T dongle with a Realtek RTL2832 chip inside. That chip model is important, apparently it can be put into a 'dumb' mode where it samples and passes the I/Q data unmodified to the PC. With a proper tuner attached it will cover the 80-1700MHz range, with some holes though, and the SNR depends on the frequency. The best tuner seems to be Elonics E4000 but other ones (FC0012/FC0013 for example) will also work, though with more holes so less frequency coverage. This mode was meant for software processing of FM and DAB/DAB+ radio signals, and it seems to be an undocumented feature not present in other designs. RTL2832 can only provide 8-bit data but it makes up for that in the bandwidth of 1Ms/s up to 3.2Ms/s. The best setting is a rate that divides the on-board resonator frequency, usually 28.8MHz, by an integer, so that there are no sample drops. So, while far from perfect it's a great bargain for 20 bucks.

And the best thing, even though it's called a radio, is that such setup is pretty much a PC-based spectrum analyzer. These things are quite expensive as standalone hardware so it'll be interesting to see if a self-made project like this can be useful for something other than listening to police radio traffic :)
12th-Mar-2012 02:40 pm - Still cranking
I didn't get into finer details of the low-level CD dump in my previous post because it would get really long and boring - some people however took that as a lack of proper understanding of the complexity of such project on my part. Rest assured I did give it a lot of thought :)

I'm fully aware that a raw bitstream from the optical pickup contains many bit glitches, as EFM is far from perfect. After all this is what we have CIRC for. If the drive logic can deal with those glitches, so can we. The only difference is the drive will correct (where possible) the bad symbols in the stream and produce a correct output, whereas we want to come up with a clean input stream that would not even need correcting. This is more problematic but still doable.

What is a bit glitch? It's exactly as it sounds, a single bit having wrong value (1 instead of 0, or the opposite) somewhere in the stream. It's usually just one bit per byte, and doesn't happen very often, but more serious read errors happen as well from time to time. How often is "not very" you might wonder. Well, let's analyze Ranma dump I've made.

To keep things a bit less messy I will only use tracks 3-55 of the dump here, all of which are audio. 257060 audio sections to be precise, for a somewhat round number. Since each section is accompanied by exactly one Q sub packet, we have exactly 257060 of those as well. Knowing that this particular subchannel has a CRC we can easily check the integrity of all of them. And it turns out 151 are glitched.

In case you're not clear on why we stick to subchannels and not actual data, it's because subs are not protected by CIRC so a dump will not be repaired on the fly. Or at least should not be, some drives can actually do that for P & Q subs. It's not all that difficult (might be a bit slow though) - knowing that it's usually one bit that goes wrong we just keep flipping all the 96 bits in the Q packet and recalculate the CRC. Eventually we find the right one and have repaired the stream. Now, I do realize that we could end up with a bad repair and CRC collision but that is very unlikely for a single bit change. Also, keep in mind that it might be the CRC area that has the glitched bit :)
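The bit-flipping repair described above can be sketched in a few lines of Python. I'm assuming the usual Q-channel layout here: 10 bytes of payload followed by a 16-bit CRC with polynomial x^16 + x^12 + x^5 + 1, initial value zero, stored inverted on the disc. The example packet is made up, not taken from the Ranma dump.

```python
from typing import Optional


def crc16_q(data: bytes) -> int:
    """CRC-16 (poly 0x1021, init 0), inverted as it is stored in the Q packet."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc ^ 0xFFFF


def q_ok(packet: bytes) -> bool:
    """True if the 12-byte Q packet carries a matching CRC."""
    return (len(packet) == 12
            and crc16_q(packet[:10]) == int.from_bytes(packet[10:], "big"))


def repair_single_bit(packet: bytes) -> Optional[bytes]:
    """Flip each of the 96 bits in turn (CRC area included) until the CRC matches."""
    if q_ok(packet):
        return packet
    for bit in range(96):
        trial = bytearray(packet)
        trial[bit // 8] ^= 0x80 >> (bit % 8)
        if q_ok(bytes(trial)):
            return bytes(trial)
    return None  # more than one bit is wrong, needs a smarter pass


PAYLOAD = bytes([0x01, 0x03, 0x01, 0x00, 0x10, 0x20, 0x00, 0x00, 0x12, 0x20])
GOOD = PAYLOAD + crc16_q(PAYLOAD).to_bytes(2, "big")
BAD = bytearray(GOOD)
BAD[4] ^= 0x08                        # simulate a single bit glitch
REPAIRED = repair_single_bit(bytes(BAD))
print(REPAIRED == GOOD)
```

With only 96 candidates per packet this is cheap; the two-adjacent-bits case would need a second loop over bit pairs.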

Sure enough, out of those 151 bad packets 150 were fixed by flipping one bit. One case was two bits going wrong next to each other, so it could be a fluke, or a case of a "walking bit" problem. These are also easily fixed but require another pass over the data and more complicated algorithm. In the end all packets were repaired though. CIRC is more complicated in principle but also designed for fixing the errors, not just detecting them. This is why we could have a dump that has many glitches but can be used since it auto-repairs itself when processed properly. But, obviously, we want to have clean one - and the question is how do we obtain one.

We could just repair the image after the dump is done, that will probably fix most of the glitches. Some might be tricky enough to warrant a partial redump to make sure, but that could be automated, we could even do two passes in all cases to have more material to work with. Unless the disc is damaged and keeps returning errors for one particular area, the glitches are otherwise pretty random, caused mostly by various harmonic frequencies in the rotation speed introduced by the spindle motor. Pickup positioning and focus is also not perfect and subject to both vibrations and thermal noise in the coil drivers.

Ranma, by the way, is also not exactly an example of good CD mastering. The data LBA to Q sub offset is maybe not as big as on Lady Phantom but it's there. Also, there are mode-2 Q sub packets present on the disc, even if empty, which makes one wonder why they didn't delete them altogether if the Catalog Number was not recorded. I'm mentioning that because mode-2 Q does not carry addressing information except the fraction field.

And, just to put things into perspective here, since you might think that 151 glitches per (pretty much) a whole disc is not a big deal: This was just the Q subchannel data. There are 2352 bytes in the sector (when dumped raw or as audio section) and another 96 in the subchannels (98 actually but 2 bytes are used internally as a sync field, and not reported). A raw sector plus sub packet, when rejoined together, is 2448 bytes total. The Q subchannel alone is just 12 bytes. Since we got 151 errors by analyzing just 12 bytes out of 2448, the total number of glitches would be about 30 thousand for the whole dump. And that is what I call a problem, and why we don't have a low-level dumper yet.
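The extrapolation above, worked through as a small Python sketch. It assumes glitches are spread evenly over every byte of a raw sector plus subchannel packet, which is a simplification; the constant and function names are only illustrative.

```python
SECTOR_BYTES = 2352      # raw sector / audio section
SUBCHANNEL_BYTES = 96    # reported subchannel bytes per section
Q_BYTES = 12             # Q subchannel share of those 96 bytes


def estimate_total_glitches(bad_q_packets: int) -> int:
    """Scale the Q-only error count up to the whole 2448-byte unit."""
    total_bytes = SECTOR_BYTES + SUBCHANNEL_BYTES  # 2448
    return bad_q_packets * total_bytes // Q_BYTES


# 2448 / 12 = 204, so 151 bad Q packets scale to 151 * 204 glitches.
print(estimate_total_glitches(151))
```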
2nd-Mar-2012 11:31 pm - Cranking old engines
Lookit what goodies came in the mail today:



There is a reason why I'm showing you Lady Phantom - this is one of the somewhat problematic CD-ROMs for PCE. First, the TOC:

-> Track 01  (Audio, 00:47:65, LBA: 0 / 00:02:00)
-> Track 02  (Mode 1, LBA: 3590 / 00:49:65)
-> Track 03  (Audio, 02:34:63, LBA: 5157 / 01:10:57)
-> Track 04  (Audio, 02:23:59, LBA: 16770 / 03:45:45)
-> Track 05  (Audio, 02:48:09, LBA: 27554 / 06:09:29)
-> Track 06  (Audio, 02:30:08, LBA: 40163 / 08:57:38)
-> Track 07  (Audio, 02:48:02, LBA: 51421 / 11:27:46)
-> Track 08  (Audio, 02:27:37, LBA: 64023 / 14:15:48)
-> Track 09  (Audio, 00:33:26, LBA: 75085 / 16:43:10)
-> Track 10  (Audio, 00:09:50, LBA: 77586 / 17:16:36)
-> Track 11  (Audio, 00:33:54, LBA: 78311 / 17:26:11)
-> Track 12  (Audio, 02:13:59, LBA: 80840 / 17:59:65)
-> Track 13  (Mode 1, LBA: 90874 / 20:13:49)
-> Track 14  (Audio, 02:14:25, LBA: 91550 / 20:22:50)
-> Track 15  (Mode 1, LBA: 101625 / 22:37:00)
-> Track 16  (Audio, 02:55:64, LBA: 102325 / 22:46:25)
-> Track 17  (Mode 1, LBA: 115514 / 25:42:14)
-> Track 18  (Audio, 02:05:73, LBA: 116214 / 25:51:39)
-> Track 19  (Mode 1, LBA: 125662 / 27:57:37)
-> Track 20  (Audio, 01:54:28, LBA: 126384 / 28:07:09)
-> Track 21  (Mode 1, LBA: 134962 / 30:01:37)
-> Track 22  (Audio, 01:43:38, LBA: 135678 / 30:11:03)
-> Track 23  (Mode 1, LBA: 143441 / 31:54:41)
-> Track 24  (Audio, 02:30:03, LBA: 144149 / 32:03:74)
-> Track 25  (Mode 1, LBA: 155402 / 34:34:02)
-> Track 26  (Audio, 01:52:62, LBA: 156088 / 34:43:13)
-> Track 27  (Mode 1, LBA: 164550 / 36:36:00)
-> Track 28  (Audio, 02:09:10, LBA: 165266 / 36:45:41)
-> Track 29  (Mode 1, LBA: 174951 / 38:54:51)
-> Track 30  (Audio, 00:37:56, LBA: 175575 / 39:03:00)
-> Track 31  (Mode 1, LBA: 178406 / 39:40:56)
-> Track 32  (Audio, 01:21:72, LBA: 179070 / 39:49:45)
-> Track 33  (Mode 1, LBA: 185217 / 41:11:42)
-> Track 34  (Audio, 01:43:24, LBA: 185889 / 41:20:39)
-> Track 35  (Mode 1, LBA: 193638 / 43:03:63)
-> Track 36  (Audio, 01:11:69, LBA: 194338 / 43:13:13)
-> Track 37  (Mode 1, LBA: 199732 / 44:25:07)
-> Track 38  (Audio, 01:31:39, LBA: 200368 / 44:33:43)
-> Track 39  (Mode 1, LBA: 207232 / 46:05:07)
-> Track 40  (Audio, 03:42:69, LBA: 208632 / 46:23:57)
-> Track 41  (Audio, 01:01:60, LBA: 225351 / 50:06:51)
-> Track 42  (Mode 1, LBA: 229986 / 51:08:36)
-> Track 43  (Audio, 04:11:57, LBA: 230654 / 51:17:29)
-> Track 44  (Mode 1, LBA: 249536 / 55:29:11)
-> LeadOut  (LBA: 250814 / 55:46:14)

It's very rare for any disc to have so many data and audio tracks mixed together. PCE doesn't use ISO9660 so the devs just split the game data as they saw fit.
If this was a well-mastered Yellow Book compliant CD-ROM then we'd just rip the tracks, ignore pre/postgaps and it'd be a complete, working dump. It's not so easy in this case. Let's analyze the transition from track 14 to track 15.

-> Track 14  (Audio, 02:14:25, LBA: 91550 / 20:22:50)
-> Track 15  (Mode 1, LBA: 101625 / 22:37:00)

The last sections of track 14 are silent. This is normal, except for cases where you don't want gaps in audio. Since the next track is data it's a good idea to do this properly. Except, you know, the very last sectors are filled with junk here :)

LBA  :  101472

0000 :  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................  <- silence
SUBC :  01 14 01 02 12 22 00 22  34 72 00 00 00 00 00 00  ....."."4r......

LBA  :  101473

0000 :  6A 82 AF 21 BC 18 71 CA  A4 57 3B 7E 93 60 6D E8  j..!..q..W;~.`m.  <- junk
SUBC :  01 14 01 02 12 23 00 22  34 73 00 00 00 00 00 00  .....#."4s......

LBA  :  101474

0000 :  6A 82 AF 21 BC 18 71 CA  A4 57 3B 7E 93 60 6D E8  j..!..q..W;~.`m.  <- more junk
SUBC :  01 14 01 02 12 24 00 22  34 74 00 00 00 00 00 00  .....$."4t......

Notice how Q sub properly identifies this as audio track 14, index 1, time 22:34:74 - which is (22*60 + 34)*75 + 74 = 101624. With the usual TOC offset of 150 we get LBA 101474. Figures.
Also, TOC says track 14 should be 2:14:25 long and sure enough this is the last audio section, 2:12:24, according to Q (these counters start at zero).
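The time-to-LBA arithmetic used here can be sketched in Python as follows. It assumes 75 sections per second and the usual 150-section (2-second) TOC offset, with Q time fields stored as BCD bytes; the helper names are hypothetical.

```python
def bcd(byte: int) -> int:
    """Decode one packed-BCD byte, e.g. 0x34 -> 34."""
    return (byte >> 4) * 10 + (byte & 0x0F)


def msf_to_lba(minute: int, second: int, frame: int) -> int:
    """Absolute MM:SS:FF to LBA, with the 150-section TOC offset removed."""
    return (minute * 60 + second) * 75 + frame - 150


def lba_to_msf(lba: int) -> tuple:
    """Inverse conversion, returning (minute, second, frame)."""
    total = lba + 150
    return total // (60 * 75), (total // 75) % 60, total % 75


# Q absolute time bytes 22 34 74 from the dump above, decoded from BCD.
print(msf_to_lba(bcd(0x22), bcd(0x34), bcd(0x74)))  # 101474
# Track 15 start from the TOC, 22:37:00.
print(msf_to_lba(22, 37, 0))                        # 101625
```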
Now, there should be a 2-second long pregap for a data track that follows an audio track, and the data track starts at LBA 101625, so the first Mode 1 sector should be at LBA 101625 - 2*75 = 101475:

LBA  :  101475

0000 :  00 FF FF FF FF FF FF FF  FF FF FF 00 22 35 00 01  ............"5..  <- Mode 1 header, first sector of a pregap
SUBC :  01 14 01 02 12 23 00 22  34 73 00 00 00 00 00 00  .....#."4s......  <- Q address goes back!

And we have a Mode 1 header! Except Q sub still says it's audio track #14-1 :) Also, data header address is 22:35:00 and Q has 22:34:73.
It doesn't take a genius to notice that 22:34:73 was also the first audio sector with "junk" instead of proper silence. Gee, I wonder what would happen if we extracted that junk and ran it through a descrambler. Yup. It is, in fact, the same sector we are looking at now and the header is 00 FF FF FF FF FF FF FF FF FF FF 00 22 35 00 01.
So... the last two sections of track 14 and the first two sectors of track 15 are overlapping. Depending on which track you're reading, the drive will return either raw bytes for audio or descrambled sectors for data. In fact, it's even worse:

0000 :  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0010 :  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0020 :  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
...
0440 :  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ................
0450 :  00 00 00 00 00 00 00 00  00 00 00 00 00 FF FF FF  ................
0460 :  FF FF FF FF FF FF FF 00  23 B5 00 61 00 28 00 1E  ........#..a.(..
0470 :  80 08 60 06 A8 02 FE 81  80 60 60 28 28 1E 9E 88  ..`......``((...
0480 :  68 66 AE AA FC 7F 01 E0  00 48 00 36 80 16 E0 0E  hf.......H.6....
0490 :  C8 04 56 83 7E E1 E0 48  48 36 B6 96 F6 EE C6 CC  ..V.~..HH6......

The data sector 22:35:00 starts in the middle of audio section 22:34:72. Which is perfectly OK. Except it cannot be properly ripped when the drive already cooks the raw data into sectors - since you can sync data to headers but there is no way to sync the audio.
And this is why I said it's not always possible to rip a CD-ROM track by track, even using subcodes as a reference. It only works for discs where audio sections match data sectors perfectly and even then it's down to drive logic to figure out how to join subchannel data with post-CIRC stream. So, claiming that any dump is a perfect copy of the master image is just... silly.
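To check the hex dump above by hand, here is a rough Python sketch that finds the 12-byte sync pattern in a raw audio-style stream and descrambles what follows. It assumes the standard CD-ROM scrambler, as I understand ECMA-130: a 15-bit LFSR with polynomial x^15 + x + 1, seeded with 0x0001, applied to everything after the sync. The function names are made up.

```python
SYNC = bytes([0x00] + [0xFF] * 10 + [0x00])


def scramble_table(length: int = 2340) -> bytes:
    """Generate the scrambler byte sequence (LSB of the register first)."""
    table = bytearray(length)
    reg = 0x0001
    for i in range(length):
        byte = 0
        for bit in range(8):
            byte |= (reg & 1) << bit
            feedback = (reg ^ (reg >> 1)) & 1
            reg = (reg >> 1) | (feedback << 14)
        table[i] = byte
    return bytes(table)


def find_sync(stream: bytes, start: int = 0) -> int:
    """Offset of the next sync pattern, or -1 if there is none."""
    return stream.find(SYNC, start)


def descramble(sector: bytes) -> bytes:
    """XOR everything after the 12 sync bytes with the scrambler table."""
    table = scramble_table()
    body = bytes(b ^ t for b, t in zip(sector[12:], table))
    return sector[:12] + body


# Bytes as seen in the dump: silence, then sync at 0x045C, then 23 B5 00 61.
raw = bytes(0x45C) + SYNC + bytes.fromhex("23B50061" "0028001E")
offset = find_sync(raw)
print(hex(offset), offset % 4)                 # sync position and alignment
print(descramble(raw[offset:])[12:16].hex())   # header: 22 35 00 01
```

The table starts 01 80 00 60 00 28 00 1E, so the scrambled 23 B5 00 61 comes out as the header 22 35 00 01, and the bytes after it descramble to zeros, as expected for a pregap sector.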

Let's see what happens next with our dump:

LBA  :  101476

0000 :  00 FF FF FF FF FF FF FF  FF FF FF 00 22 35 01 01  ............"5..
SUBC :  01 14 01 02 12 24 00 22  34 74 00 00 00 00 00 00  .....$."4t......  <- note that Q still says track type is audio

LBA  :  101477

0000 :  00 FF FF FF FF FF FF FF  FF FF FF 00 22 35 02 01  ............"5..  <- M1 address: 22:35:02
SUBC :  41 15 00 00 01 74 00 22  35 00 00 00 00 00 00 00  A....t."5.......  <- Q address: 22:35:00 (index 0, so a pregap)

LBA  :  101478

0000 :  00 FF FF FF FF FF FF FF  FF FF FF 00 22 35 03 01  ............"5..
0920 :  00 00 00 00 00 00 00 00  00 00 AD 90 29 C9 83 16  ............)...
SUBC :  41 15 00 00 01 73 00 22  35 01 00 00 00 00 00 00  A....s."5.......

LBA  :  101625

0000 :  00 FF FF FF FF FF FF FF  FF FF FF 00 22 37 00 01  ............"7..  <- First sector with actual data
SUBC :  41 15 00 00 00 01 00 22  36 73 00 00 00 00 00 00  A......"6s......  <- Q still says pregap

LBA  :  101626

0000 :  00 FF FF FF FF FF FF FF  FF FF FF 00 22 37 01 01  ............"7..
SUBC :  41 15 00 00 00 00 00 22  36 74 00 00 00 00 00 00  A......"6t......

LBA  :  101627

0000 :  00 FF FF FF FF FF FF FF  FF FF FF 00 22 37 02 01  ............"7..
SUBC :  41 15 01 00 00 00 00 22  37 00 00 00 00 00 00 00  A......"7.......  <- Q finally switches to index 1

Q never matches the header address and in fact even the track type is wrong for some sectors. But don't assume that this 2-section offset is constant for all the audio/data tracks on the disc. While it could be the case, it's not guaranteed. The spec allows up to +/- 1 second offset so the difference could be anywhere from zero to 150 and, depending on what the drive logic thinks is the start of audio data, the header position might end up in a different part of the audio section. In fact it's allowed for the header to start well inside a frame, as long as it's on a 4-byte boundary.
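The mismatch can be measured per section with something like the Python sketch below. It decodes the first 10 bytes of a mode-1 Q packet as they appear in the SUBC lines above (control/ADR, track, index, relative time, a zero byte, absolute time, all BCD) and compares the Q address with the LBA the section was read from. The names are hypothetical, and other Q modes are not handled.

```python
from typing import NamedTuple


class QPacket(NamedTuple):
    control: int   # upper nibble of byte 0; bit 2 set means data track
    adr: int       # lower nibble of byte 0; 1 means position data
    track: int
    index: int
    rel_msf: tuple
    abs_msf: tuple


def bcd(byte: int) -> int:
    return (byte >> 4) * 10 + (byte & 0x0F)


def parse_q(raw: bytes) -> QPacket:
    """Decode a mode-1 Q packet (ADR = 1); other modes are not covered."""
    return QPacket(
        control=raw[0] >> 4,
        adr=raw[0] & 0x0F,
        track=bcd(raw[1]),
        index=bcd(raw[2]),
        rel_msf=(bcd(raw[3]), bcd(raw[4]), bcd(raw[5])),
        abs_msf=(bcd(raw[7]), bcd(raw[8]), bcd(raw[9])),
    )


def q_offset(dump_lba: int, q: QPacket) -> int:
    """How many sections the Q address lags behind the dump LBA."""
    m, s, f = q.abs_msf
    return dump_lba - ((m * 60 + s) * 75 + f - 150)


# SUBC bytes of LBA 101477 from the dump above.
q = parse_q(bytes.fromhex("41150000017400223500"))
print(q.track, q.index, bool(q.control & 0x4), q_offset(101477, q))
```

Running this over every section of a dump would show whether the offset really stays at 2 across all the audio/data transitions or drifts from one track to another.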

So... all these frames, sections, offsets, sectors, tracks and gaps start to confuse you? Well, keep in mind the CD, while digital in nature, was meant to carry audio. Data sectors were added later and kinda slapped onto existing audio layer. This let the manufacturers reuse their designs by just adding header detection and descrambling hardware (along with some RAM for data buffering). But thanks to that we have this mess on our hands now :)