A while back, I received a Comsol 8Gb USB Flash Stick for a test. As it turns out, I’ve managed to grab another, so I felt less bad about breaking one apart to work out what’s inside – and as it turns out, it provided me a world of entertainment for the weekend. It was more than I expected, and the optimization process is something engineers (like myself) really get excited about.
Teardown and Experiments
I’m sad to say that the teardown of this product is a relatively destructive process. The main PCB itself is held inside a thin aluminium “tube” by two glued-in translucent plastic end-pieces. Cutting the pieces and twisting them out with pliers was the only method to free the internal PCB, which revealed some interesting results.
The PCB is marked C20V-2.0-AU6989-L48-2L-TF-1.5 and is dated 26th June 2014, making this a relatively recent design. A provision is made for a front LED, which is unused, instead in preference of a rear LED. A provision is made for a crystal oscillator, but this is unpopulated because this uses an Alcor Micro AU6989SN-GT controller with integrated crystal.
Features, excerpted from usbdev.ru, are:
– PCBs are pin compatible with AU69XX USB2.0 series
– Integrated build-in Regulator
– Integrated build-in Crystal with Alcor’s patent
– Supports 72bit/1K BCH ECC engines
– Supports new generation MLC/TLC flash
– Supports Toggle/ONFI DDR flash
– Not support the flash ECC requirement under 24bit/1K
– Well performance in TLC DDR flash
– Improved read performance reach 32MB/Sec
– Integrates hardware DMA engine to tune up the operation performance
– Works with default driver under the environments of Windows ME, Windows 2000, Windows XP, Vista, Window7, Windows 8, Mac 9.2, Mac OS 10.x. Using Alcor Micro’s vendor driver for the environment under Windows 98SE.
– Low power operation with SDR/DDR flash
– Supports software write protection
– Support Auto Run function
– Support erasable and read-only mode AP Disk
– Companion application program with UFD – iStar available for users
– – To have UFD partition management function
– – To do password protection for the security in data access
– – To guard data files with software write protection function
– – To lock up PC by UFD as the key
– Available in 48-pin LQFP 7x7mm / TQFP_7x7mm / QFN_6x6mm / QFN_7x7mm package to support 4CE pin flashx2pcs
– Available in 64-pin LQFP 7x7mm / TQFP_7x7mm package to support 4CE pin flashx4pcs
It’s a very feature rich controller, but that’s what you would expect given the competitors (e.g. Phison) which offer similar features in the value segment.
The rear offers a big surprise. The flash is a soldered down single microSDHC card! The card isn’t marked with a manufacturer but does have some etching on it. A space for a surface mount crystal is seen underneath as well.
I know what you’re thinking – this is just a card reader and microSDHC combination, minus the connector, but you’d be wrong. The controller is a flash controller, and desoldering reveals an even more interesting outcome. Out comes the hot air gun!
Only the power connections seem to be used on the microSDHC’s pin, and instead, pads are used to talk to the card. A total of 21 pin connections are made, but I’m not sure what they are.
Maybe playing with the microSDHC card would allow us to read the data from it, offering a simple way to recover the data from the card – turns out this doesn’t work. The card is identified by my reader as a 24/32Mb card and does not read properly. It appears unformatted. The read speed is 10.8Mb/s.
I also decided to go one step further and try to extract the CID and CSD information:
CID: 035344534430333280ffffffff0062c5 CSD: 00260032515981e9bef9cfff92404053
The CID and CSD provide some very unusual information. The CID gives the manufacturer ID as 0x03, which is Sandisk. The application ID of 0x5344 is also a common Sandisk trait. The Product name is SD032 with revision 8.0, which would suggest 32Mb (or Gb). Serial number seems to be unset (all oxFF’s), with date set as Feburary 2006. This is paired up with a version 1 CSD, with device size of 1958, equaling 32,096,256 bytes – so 32MiB.
As a result, it looks like this is a Sandisk part, with an SD controller, but it hasn’t gone through the final manufacturer certification and formatting procedure. Instead, it is integrated in a product using the pads on the underside as a raw NAND package, ignoring the on-board SD controller, and thus takes the format of the controller talking to the raw NAND.
How can we prove it’s a Sandisk part? Well, here’s some photos of a similar Sandisk part, as a microSDHC card, sold under the Sandisk brand.
Notice how the etched numbers match the same font-spacing style? Of course the branding hasn’t been printed and an extra etching has been made on the spine, which may be due to special binning (C grade?). The underside also features the pads, but covered by some thick paint of some sort. Now I know what those pads are for!
Getting out the hot air gun again, I was able to resolder the package onto the board without damaging it – it still works just fine. So why was this exciting? It’s because, just like the Phison USB keys I had played with earlier prior to blogging, the manufacturer’s tools are available (albeit though shady channels) to the public which allows for some interesting insight into the manufacturing and optimization process.
The software for certifying and manufacturing these USB keys is called AlcorMP. An archive of versions of the software is available from usbdev.ru. The software is capable of running in English and Chinese, but the user guide is in Chinese only, so I’ve pretty much done my exploration without much help from it.
To work properly with this particular USB key, which uses a very late controller, you must use AlcorMP Version 15.03.05.00 or newer. I tried a Version 14 with no luck, due to the flash not being supported. By default, it will run with administrative rights so it can hook a special driver to talk to the keys and reprogram them. The program has a large set of accompanying .bin files, which seem to contain code to run on the controller itself, which makes this controller a possible safety issue, as noted for Phison controllers in the BadUSB exploit. The fact the controller can be so easily reprogrammed is a boon for those making fake flash keys.
These files perform the low-level format, testing and provide the firmware for the controller as well as some special partition tools for those who wish to use the iStar features.
This is like an engineer’s playground – a tool made by engineers for engineers. The main screen of the tool looks like this, and at the moment, it is performing a low-level format. Normally, inserting the key will have it show up its flash ID and other information, and would load “sensible” processing settings based on the processing that was done on the unit at the factory.
Looking for the configuration details of the Comsol gives us the following data:
The drive is formatted with a fixed capacity of 7450Mb and has firmware version 1600. The VID and PID seem to be customized as well, but the vendor/product strings are empty, which explains the no-name nature of the drive in the HDTune Pro tests prior.
To begin the certification process, you will need to click on the Setup button. This brings up a password prompt, of which no entry is required – just click OK.
The first screen is a relatively cluttered one but it sets up the flash configuration. The flash is auto-detected from the ID bytes as a Sandisk SDTNRCAMA-008G. Leaving the number, channels, and cycle time at defaults seems to be sufficient for this drive.
The manufacturing process can be changed to optimize the drive for Speed, or for Capacity. There is a third option which seems to be for a high level format based on pre-existing bad block marked by the NAND manufacturer. You should really use either Capacity or Speed as the bad block data from the manufacturer is probably damaged.
A low level format will test the flash and make sure the flash is usable. This is desirable if you have a fake flash drive to determine its correct capacity. The check mode is in LLF Check, with several levels of thoroughness. I chose Disturb Check because this seems to be more thorough, and checks for adjacent flash cell disturbance, whereas the other tests don’t address this (but can be sufficient). I also selected half-cap check after to ensure a thorough test. The scan level can have an impact on robustness, with Full Scans taking longer but ensuring all the flash is tested – Full Scan 4 is most thorough as far as I know. The ECC level can be set between 0 to 15 – this was one setting which caught my eye and I looked to understand and optimize it further in the next part.
The special flash section is there to deal with particular types of flash with quirkiness/compatibility issues, and should be left at Normal unless you have particular errors during manufacturing.
The advanced button brings up a new dialogue with more features –
The ECC enhance level feature is normally off, but can be turned on to improve stability of the result. Low level format revise can increase the scan time by running the low level format loop a few times to improve stability, but is generally unnecessary, as the remaining ECC should be able to handle any marginal blocks that may have passed.
Pattern controls the test pattern used in low level formatting, and driving level configures the signal drive for the chips. The MaxL1fCE seems to do with flash chip enables, and should be left at the default setting. I’m not sure what LC Offset is used for. Sync Mode may be indicative of the flash interface being run in synchronous mode – but I’m not entirely sure.
Strengthen the stability should be left to default – enabling this brings up a warning that capacity and speed will be sacrificed for stability, which is probably not necessary for normal usage.
Use Block Mode controls which blocks are used. This can be changed to odd or even blocks for “salvaging” bad flash chips. Cache program can be enabled or disabled, but I’m not sure what this actually does – so I did try to see if it does anything in a later part.
The next screen across allows you to select the mode which the drive appears as – whether it’s a removable disk, fixed disk, read-only, password protected, or U3-style CD-ROM. You can pre-set some formatting parameters for the drives and the images to be pre-loaded. The LED behaviour can be customized as well. The use of the U3 style CD-ROM can be used to turn the drive into a USB CD-ROM drive for installing OSes which don’t understand USB installation (e.g. Windows XP), or for storing things read-only (without using the other read-only features of the drive)
The information tab allows you to set the VID, PID and strings – so you can customize the “name” of the drive in Device Manager. That can be a pretty good party trick.
The bad block configuration screen gives you the opportunity to set how the flash is configured. Auto check optimizes the drive size based on how much flash is actually workable. This can be potentially dangerous, as it may not leave any spare blocks for replacement should blocks fail during “runtime“. That being said, I’m not sure the Alcor Micro is capable of doing dynamic block replacements.
Dynamic Set leaves some blocks for reserve, whereas bin allows the system to automatically optimize with one target or another, and decide which “regular” capacity to allocate a drive to based on the workable flash. This is interesting as it implies there could be very odd-sized flash keys out there – maybe a 4Gb key that has 6.8Gb of workable flash?
I’d have to say that the majority of manufacturers probably don’t use this mode, and use the fix capacity mode instead – either a drive passes or fails to provide a set capacity, and that’s the end of the day (7450Mb in case of this model). The final mode sets a fixed number of blocks as a percentage as “bad”.
Other settings include the format file system (you don’t get a choice really), and options which help in real production usage (up to 32 simultaneous drives qualified using the same machine). There is one interesting ATTO Optimize feature which suggests there are a few tweaks to make the drive benchmark better. The drive can be formatted with MBR or as VFAT (which isn’t reliably bootable, but gives a tiny bit more space). The Enable Reader feature allows other chipsets with integrated SD readers to have the slot “usable”, and MaxMPTime allows for production to be limited to a certain amount of time or fail.
The other page allows you to customize flash, and do multiple loop burn-in tests. The power bits can be adjusted to make it more acceptable to the end user’s requirements. I have no idea what AutoH2 does, but the write log option provides a quick listing in a text file of the results of drive optimization.
The UI Show features are really only useful for those in production environments if they want to standardize on a particular look or colour coding – the home user can live with the default.
Speed/Capacity, and ECC Optimization
The controller itself advertises support for 72 bit/1K BCH ECC, and no less than 24 bit/1K ECC. The setting for ECC being from 0 to 16 was a little puzzling, so I tried to consult the manual with some digital translation help.
The translated sections for ECC using Google Translate reads as follows:
FLASH poor quality need to be open for FLASH bad block ECC error correction can improve certain capacity, but
There may be some risks. ECC = 0 most stringent low grid FLASH out the most stable; ECC = 15 most relaxed,
Capacity may be larger, but there may be some risks.
The original low-grid setting value refers to a low-level format ECC on the selected use.
ECC tuning levels: Level 1-4, may be appropriate to increase the capacity of FLASH, the proposed selection level 1.
Low grid correction: low grid ECC scan times can make more accurate, but it takes a little more time, check only takes effect.
Scan times: You can manually set the number of low grid scanning, you can make a more accurate scan, but it takes a little more time,
Check only takes effect.
Patten: Patten can choose different scans, mainly for the more special flash.
Use Block Mode: manually choose to do the entire block or block or even-odd block.
Cache Program: Open or closed manually select cache program command.
Using Bing Translator gives me a very similar result:
Low quality FLASH needs to open up FLASH bad block by ECC error correction, guaranteed capacity can be improved, but
There may be a certain amount of risk. ECC=0 is the most strict, low FLASH the most stability; ECC=15 is the most relaxed,
Capacity may be larger, but there may be a certain degree of risk.
Low setting refers to the use of a low-level format on the chosen ECC values.
ECC tuning level: level 1-4, may be appropriate to improve the capacity of the FLASH, choose level 1.
Correction: low several times makes the ECC scanning is more accurate, but will spend more time, check the do not take effect.
Scan frequency: low the number of scans that can be manually set, can make the scan is more accurate, but will spend more time,
Check the do not take effect.
Patten: you can select a different scan Patten, mainly for very special Flash.
Use Block Mode: manually choose to do the full block or even block or odd block.
Cache Program: choose to turn on or turn off the cache manually program command.
As a result, it seems that the ECC setting sets the tolerance to block errors during flash low level formatting. To verify this, I decided to run a low level format at every setting (taking an hour each setting) for both capacity and speed optimize. ECC Tuning was disabled. Random pattern was used during testing (resulting in slightly random variances in formatted size), and Disturb test with Full Scan 4 was used. The results were as follows:
The results seem to follow the description to some extent with some surprises. For one, the optimize for capacity option did not significantly outperform optimize for speed, and performed worse at low ECC level (strict). The capacity of most ECC levels above 3 were fairly similar, around 7800MB+, which is about 350-450Mb more than the fixed capacity it was shipped with. The number of bad blocks identified varies somewhat from ECC level to ECC level, probably due to random pattern test variation in detecting errors, coupled with some potential wear-out and alignment differences during test.
In general, it seems like the ECC level represents the number of bad bits tolerated in a flash block/page during low level formatting before the block/page is marked bad and taken out of use. Hence, lower levels are stricter, and higher levels would allow for more defective bits in the low level format (leaving less margin for wear in the future, making it less stable).
Because of the quality of the flash, the capacity is maximised even with fairly strict levels of ~3. Higher levels don’t seem to restore much more capacity, which implies that the bad blocks must contain bursts of bad cells which are uncorrectable even with wider bad-bits tolerance.
It’s probably best to certify for a fixed size a bit smaller, if you want to give some room for reallocations (provided the controller actually supports it).
So, what’s the cost of Capacity optimization versus Speed? Well, as it turns out, the speed is much better in speed mode – it’s about 84% faster in read and 44% faster in write than the shipped status. Part of the improvement seems to be new firmware – as the drive identifies with firmware version 8E8A using the AlcorMP tool.
No more nasty 13.78Mb/s read, and 3.27Mb/s write! It’s still no speed demon, but given the Sandisk card I had with the same sort of shape was a Class 4 card, the performance is probably the best we can expect from the flash. I did change the write-cycle time to the minimum setting, hoping to “push” the flash faster, but it made no difference.
ATTO Optimization and Cache Program
One other option intrigued me, and that was optimization for ATTO, a commonly used dish benchmark. Was this some sort of cheating, or a deliberate bias towards small block accesses at the expense of sequential access? Was there any tradeoff?
The other wonder was what the Cache Program option meant. Did this mean there would be an pSLC cache on the drive or some sort of attempt at optimizing for small transfers? Or was it just an option to cache production firmware on the drives themselves? Did it have any performance impact?
Capacity Optimized, No ATTO Optimization
Speed Optimized, No ATTO Optimization
On the whole, with no ATTO optimization, aside from slight variations, it seems that the optimize for performance selection provides better performance across the board.
Speed Optimized, ATTO Optimization ON
Turning on ATTO Optimization doesn’t seem to do much at all. There’s a few slight increases across the board, but I suppose this flash chip isn’t particularly capable and neither is the single channel design.
Speed Optimized, ATTO Optimization ON, Cache ON
Turning on cache didn’t make a lick of difference either, so I might as well leave that at default. It wasn’t the option I hoped it was.
I also used H2testw and CrystalDiskMark as benchmarks to find out just how far we’ve improved from the baseline case of “as shipped” and the impact of each of the options. It seems that optimize for speed gives a good boost overall, and ATTO optimization provides a slight advantage. Cache didn’t have much of a notable impact, except for a strange reduction in 512kB writes in CDM and a slight increase in H2testw writes.
My finalized settings including ECC Enhance Level 1, with LLF Revise and Dynamic Set (4) was tested as well on the rightmost column, as my “final” optimized result. It doesn’t seem that these options, which affect the LLF process, actually affects the final drive performance beyond that of normal “test to test variation”. Not bad for a “cost free” solution.
Of note was that the drive was stable enough to pass H2testw in the fresh state even at varying format capacities depending on the run. This doesn’t indicate the drive will be stable in the future, as the flash wears, hence my recommendation to use stricter ECC (than the default 8 used by this manufacturer) and dynamic set to maximise the storage and improve the “safety” margin.
I even tried being very negligent – doing a manufacture run at ECC 15, with ECC Enhance at 4 (loose), speed optimize, with half-capacity scan, quick-scan 1 selected, and auto size for bad blocks and the resulting unit still passed the H2testw test, however, might be unstable as flash cells wear out. It didn’t yield me any significant capacity gain, and that’s likely due to burst-error accumulation in manufactured flash.
The quest for optimization seems to be something in most engineer’s hearts. It’s one reason why I do love overclocking. Playing with a manufacturer’s tool is like overclocking a USB key – performance and capacity was both improved at no cost! It turns a nasty key into one that more closely resembles its advertised “up to 20MB/s” and makes it tolerable for use. Of course, it may not be as stable as otherwise, depending on the options you choose – but now you’re in the driver’s seat and that’s pretty cool.
Of course, none of this is without risk, and it’s equally possible to ruin a good drive, brick it or otherwise make the performance or stability worse. You can and will void warranties as well – do all of this at your own risk.
In the end, I opted for ECC level 4, plus ECC Enhance Level 1 and LLF Revise ON, cache default and chose to go with Dynamic Set (4) for bad blocks to give some margin for reallocation. That should give me some more capacity, with the stability you would expect.