Sunday, August 2, 2015

Ralph's rant: non-portable AVR code

One thing I like about AVR MCUs is that in addition to instruction-set compatibility, a number of them have some degree of I/O register-level compatibility.  For example, both the ATtiny85 and ATtiny84a have PORTB at I/O address 0x18.  Because of this, I was able to write my 64-byte picoboot bootloader, which uses a soft UART on PB1, so that a single binary works on both the tiny85 and tiny84.

I recently thought I could take advantage of the register-level compatibility between the ATmega328p and ATmega168p in my Arduino-compatible picoboot bootloader.  The source is already identical, and the only differences in the binary files are the flash start address and the signature bytes reported.  My idea was to build a version which returned the signature bytes of the 328p, but that loaded on the 168p.  When flashed to a 168p, it would look like a Uno to the Arduino IDE, so people could switch between a 328p board and a 168p board without having to modify the boards.txt file.  Obviously projects with a code size larger than 16KB wouldn't work, but for everything else, I thought it was a great idea.  But it didn't work.

The bootloader would initially work OK; clicking Upload in the Arduino IDE would seem to upload the code to the 168p board when Uno was selected as the target, but the uploaded code wouldn't work.  I double-checked the fuse settings for the 168p.  I flashed the board back to the regular 168p bootloader, selected my modified pro mini 168 target in the boards menu, and uploaded.  Everything worked fine, so there was nothing wrong with the board.  I compared the disassembly of the normal 168p bootloader and my 168p masquerade bootloader as I was calling it; the only difference was the signature bytes reported.  I even reviewed the 168p/328p datasheet in case I missed an important difference - and found nothing.

I then decided to verify that the bootloader was properly flashing the uploaded code and hadn't somehow corrupted the flash.  I uploaded a basic blink program using the 168p masquerade bootloader, then connected a USBasp to read back the full contents of the flash, including the bootloader:
avrdude -c usbasp -C /etc/avrdude.conf -p m168p -U flash:r:flash168masq.hex:i

Then I used avr-objcopy to convert the hex file to elf:
avr-objcopy -I ihex flash168masq.hex -O elf32-avr flash168.elf

Finally, I used avr-objdump to disassemble the elf file:
avr-objdump -D flash168.elf

The reset vector was a jump to 0x00ae:
       0:       0c 94 57 00     jmp     0xae    ;  0xae
      ae:       11 24           eor     r1, r1
      b0:       1f be           out     0x3f, r1        ; 63
      b2:       cf ef           ldi     r28, 0xFF       ; 255
      b4:       d8 e0           ldi     r29, 0x08       ; 8
      b6:       de bf           out     0x3e, r29       ; 62
      b8:       cd bf           out     0x3d, r28       ; 61

The code at 0x00ae first clears the zero register (r1), then clears SREG (0x3f).  Clearing SREG is redundant since section 7.3.1 of the datasheet shows that SREG is always cleared after reset.  Clearing it again wasn't going to cause any problems though.  The next four instructions initialize the stack pointer (SPL and SPH).  I immediately recognized this as the problem.  I described how this was redundant in Trimming the fat from avr-gcc code.  In this case it wasn't redundant, it was wrong!  Since avr-gcc thought it was generating code for a m328p, it included the (normally just redundant) code to initialize the stack pointer to 0x08FF.  But on the m168p, the end of RAM, and therefore the reset value of the stack pointer, is 0x04FF.  With the stack pointer set beyond the end of RAM, it was obvious why programs uploaded to the 168p masquerading as a 328p weren't working.
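The mismatch comes down to simple address arithmetic. Here is a minimal sketch that can be compiled on a PC, assuming the datasheet layout where SRAM starts at 0x0100 on both parts, with 2KB on the 328p and 1KB on the 168p. The helper names ram_end and stack_init_is_valid are made up for illustration and are not part of avr-libc.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed SRAM layout (from the 168p/328p datasheet): both parts start
   SRAM at 0x0100; the 328p has 2KB and the 168p has 1KB. */
#define SRAM_START      0x0100u
#define M328P_SRAM_SIZE 2048u
#define M168P_SRAM_SIZE 1024u

/* Last valid SRAM address, which is also the stack pointer's reset value. */
static uint16_t ram_end(uint16_t sram_start, uint16_t sram_size)
{
    assert(sram_size > 0);
    return (uint16_t)(sram_start + sram_size - 1u);
}

/* Hypothetical check: does a hard-coded stack pointer value land inside
   this part's SRAM? */
static int stack_init_is_valid(uint16_t sp_init, uint16_t sram_start,
                               uint16_t sram_size)
{
    return sp_init >= sram_start && sp_init <= ram_end(sram_start, sram_size);
}
```

With these numbers, ram_end() gives 0x08FF for the 328p and 0x04FF for the 168p, so the 0x08FF that avr-gcc hard-codes for the 328p fails the check on a 168p.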

So the superfluous code emitted by avr-gcc not only wastes space, it interferes with releasing binary code that runs on a number of different AVR MCUs.  I think it also demonstrates the dangers of developers writing code with an "it shouldn't hurt" attitude rather than an "is it necessary?" attitude.  I don't know who first said it, but it was a wise man that recognized when building a project you should include everything necessary but nothing more.

Tuesday, July 28, 2015

Externally clocking (and overclocking) AVR MCUs

People familiar with AVR boards such as Arduinos likely know most AVR MCUs can be clocked from an external crystal connected to 2 of the pins.  When the AVR does not need to run at a precise clock frequency, it is also common to clock it from the internal 8MHz oscillator.  Before CPUs were made with internal oscillators or inverting amplifiers for external crystals, they were clocked by an external circuit.  Although you won't see many AVR projects doing this, every AVR I have used supports an external clock option.  One (extreme) example of a project using an external clock is Brad's Quark85 video game platform.  Some AVRs such as the tiny13a and the tiny88 do not support an external crystal, so the internal oscillator or an external clock circuit are the only options.  The 4-pin metal can pictured above is a clock circuit hermetically sealed for precision and stability.  They can be bought from Asian sources for under $2.

A common reason for needing an external clock for an AVR MCU is accidentally setting the fuses for an external clock.  Once the fuses are set to external clock, they cannot be reprogrammed without providing an external clock signal.  Wiring the oscillator is simple; connect power and ground, then connect the output to the CLKI pin of the MCU.  On the ATtiny13a, this is pin 2 (PB3).  On the ATmega328-AU, this is pin 7 (PB6).

The output of the oscillators is very stable and accurate, around a few ppm, as measured by my Rigol scope.  The output is almost rail-to-rail (0-5V), and quite clean:

Although the connection is simple, it's not foolproof.  During my experimentation, I accidentally plugged my oscillator in backwards (connecting 5V to Gnd and Gnd to 5V), which quickly fried it.  Now I'll be extra careful with the M-Tron 40MHz oscillator so I don't kill that too!

AVRs are known for being easy to overclock, but I was uncertain whether an ATtiny13a rated for 20MHz would work when overclocked to more than double its rated speed.  I experienced no problems flashing code with avrdude and running my bit-bang UART at either 40 or 44.3MHz with a 5V supply.  At 3.3V it crashed most of the time, only running OK occasionally.

Another way to provide an external clock is to build a ring oscillator using a 7404 hex inverter or similar chip.  A 3-stage ring oscillator I built using a 7404 generated a clock close to 30MHz:

Since the frequency is inversely proportional to the number of stages, a 5-stage oscillator using the same 7404 would generate a frequency of 18MHz.  I tried to make a single-stage oscillator with the 7404 and also with a 74LS00, but was unsuccessful.  They are just not fast enough to generate a 90MHz clock.  Considering the 7404 I used is a Fairchild part with a 1984 date stamp, I'm pleased with how well this 30-year-old part works.
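The 18MHz and 90MHz figures follow from the usual ring oscillator relation f = 1 / (2 * N * t_pd), where N is the number of stages and t_pd is the delay of one gate. A small sketch of that arithmetic, assuming the per-gate delay stays the same regardless of the number of stages (the function names are only for illustration):

```c
#include <assert.h>

/* Scale a measured ring-oscillator frequency to a different stage count,
   assuming the per-gate delay does not change. */
static double ring_freq_mhz(double measured_mhz, int measured_stages,
                            int new_stages)
{
    assert(measured_stages > 0 && new_stages > 0);
    return measured_mhz * measured_stages / new_stages;
}

/* Per-gate propagation delay in ns implied by f = 1 / (2 * N * t_pd). */
static double gate_delay_ns(double freq_mhz, int stages)
{
    assert(freq_mhz > 0.0 && stages > 0);
    return 1000.0 / (2.0 * stages * freq_mhz);
}
```

Plugging in the 30MHz measured with 3 stages gives 18MHz for 5 stages and 90MHz for a single stage, and implies a gate delay of roughly 5.6ns.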

The last way of getting a clock source I'll describe is to tap off the XTAL pin of an AVR (or other MCU) that is using an external crystal.  Most AVRs can drive the external crystal in low-power or full-swing mode.  For the ATmega8a, the CKOPT fuse enables full-swing mode.  If the AVR is driving the crystal in low-power mode, the peak-to-peak voltage will not be enough to work as the external clock for another AVR.  By soldering a wire to one of the XTAL pins you can use it to clock another MCU.  I've labeled the XTAL pins in yellow on a Chinese USBasp clone:

And here's a shot from my scope connected to the 12MHz crystal on the USBasp:

Finally, if your external clock is slower than 8MHz (like if you were to use a 555 timer to generate the clock) you'll probably need to use a slower SPI bit clock setting with avrdude.  I've found that avrdude -B 4, which specifies a 4 microsecond clock period, will work with AVRs clocked as low as 1MHz.
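To pick a -B value for other clock speeds, a rough sketch is below. It assumes the usual AVR serial programming rule of thumb that SCK should run no faster than about one quarter of the target clock; check the datasheet for your part, and treat the function name as made up for this example.

```c
#include <assert.h>

/* Shortest ISP SCK period in microseconds for a target clocked at
   f_cpu_hz, assuming SCK must be no faster than f_cpu / 4. */
static double min_sck_period_us(double f_cpu_hz)
{
    assert(f_cpu_hz > 0.0);
    return 4.0e6 / f_cpu_hz;
}
```

For a 1MHz target this gives 4 microseconds, which matches the -B 4 setting above. A slower clock, such as one from a 555 timer, would need a proportionally larger -B value.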

Saturday, July 18, 2015

$3 USB gamepad teardown

I recently received a USB gamepad I ordered off Aliexpress for a little more than $3.  I got it for a RetroPie box I'm planning to build, so I don't need anything fancy.  A USB controller chip alone can easily cost $1, so I was curious to see what went into making these.  The photo shows it is pretty simple.

The PCB is single-sided bakelite, since it is really cheap.  While double-sided FR4 PCBs cost around 5c/sq in, even in volume, a single-sided bakelite board is under 2c/sq in.  The USB controller chip is on the other side of the board covered in an epoxy blob, so I can't say what kind of controller chip it is.  Besides the controller chip, the only electronic components are the 6MHz resonator and the ceramic capacitor.  The wires connecting the L/R buttons to the PCB are cheap - similar to the wires twist ties are made from.  The controller looks like it has good strain relief, with the cord winding around a few plastic posts.

The controller was detected (under Windows 7) as a HID-compliant game controller.  I haven't finished setting up my RetroPie box yet, so I tried it out with Doom.  The button feel wasn't the greatest, but all 12 of the buttons worked.  Overall, I'm satisfied with the controller considering the low price.

Tuesday, July 14, 2015

Rigol DS1054Z frequency counter accuracy

I recently found out that in addition to a software frequency measurement (shown in the bottom right) the DS1000Z series has a hardware frequency counter (shown in the top right).  The hardware counter is enabled by pressing the "measure" button, then selecting counter, CH1.  The display shows 6 digits, or 1 ppm resolution, but I was unable to find a specified accuracy for the counter.  My testing suggests the accuracy at ~25C ambient temperature is 1-2ppm.

The first measurements I took were with a couple old metal can 4-pin oscillators I had salvaged.  One is a Kyocera 44.2368MHz that I measured at 44.2369MHz.  The second was a M-tron 40.000000MHz that I measured at 39.9999MHz.  The next thing I measured was a generic 12.000MHz crystal on a USB device which measured 12.0001MHz.   Together those measurements suggested an accuracy of <10ppm.  I don't have a high-precision clock source such as an oven-controlled crystal oscillator or GPS receiver with a timing output, so I needed another way to precisely measure the accuracy of the frequency counter.

My solution was to accurately measure the 1kHz test signal output from the scope, since the frequency counter measured it at an exact 1.00000kHz.  I don't have access to a calibrated frequency counter such as a 5381A, but I do have Kasper Pedersen's nft software.  I connected the test posts for the 1kHz output to the Rx line on a USB-TTL module, and started up nft.

From the mode options I selected pulse at 1kHz.  I could tell the pulses were being detected because the "Events" count was going up by about 1000 per second.

I did a few 300s runs that gave an average error of -0.98ppm.  I then let it run for two 1000s tests which resulted in an average error of -1.68ppm.  I don't recall Dave's teardown identifying the timing source, but given the amount of error, I'd rule out an OCXO.  The accuracy is a bit better than the ±10ppm for a typical crystal oscillator, so maybe it uses a temperature-compensated crystal oscillator (TCXO).  If anyone knows for sure, drop a line in the comments.

In addition to testing accuracy, I tested the frequency range.  I probed the antenna output from a 433.92MHz ASK/OOK transmitter.  The software frequency counter identified it as 435MHz, but the hardware counter showed 66.1680MHz.  The signal level was low (around 300mV), so that may have caused problems for the hardware counter.  I suspect it is good up to 100MHz, which is more than I expect to need in the foreseeable future.  The accuracy and frequency range are sufficient for the things I want to do, like checking oscillators on MCUs.  I found one of my $2 Chinese Pro Minis was oscillating at 15.9973MHz.  The -169ppm error would be acceptable for a ceramic resonator, but this was with an HC-49S package crystal oscillator.
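The ppm figures in this post are just the difference between the measured and nominal frequencies, divided by the nominal frequency and multiplied by one million. A minimal sketch (the function name is only for illustration):

```c
#include <assert.h>

/* Frequency error in parts per million relative to the nominal value. */
static double ppm_error(double measured_hz, double nominal_hz)
{
    assert(nominal_hz > 0.0);
    return (measured_hz - nominal_hz) / nominal_hz * 1.0e6;
}
```

For the Pro Mini, 15.9973MHz against a nominal 16MHz works out to about -168.75ppm, which rounds to the -169ppm quoted above.  The 12.0001MHz reading from the generic 12MHz crystal works out to about +8.3ppm.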

Wednesday, June 3, 2015

AVR eeprom debug log

Using text output for debugging is a common technique in both embedded and hosted environments.  In embedded environments the overhead of printf() or Wiring's Serial.print() can be quite large - over 1KB.  A lightweight transmit-only soft UART like my BBUart with some code to convert binary to hex will take 64 bytes, but on an 8-pin AVR, dedicating a pin to a soft UART may be a problem.  For some old parts like the ATtiny13a, the accuracy of the internal RC oscillator can also make UART output problematic.  I recently purchased 5 ATtiny13a's, and running at 3.3V, the oscillator for one of them was closer to 9.2MHz than the nominal 9.6MHz specified in the datasheet.

My solution is to use the EEPROM for debug logging.  The code takes only 22 bytes of flash, and the data log can be read using avrdude.  The eelog function will use up to 256 bytes of EEPROM as a circular log buffer.  Just include eelog.h, then call eelog(), passing a byte to add to the log.  I wrote a test program which logs address 0x3F through 0x00 of the AVR I/O registers.  Then I used avrdude to save the EEPROM in hex form to a file:

Then the file can be viewed in a text editor.  Another option would be to save the EEPROM as a binary file and use a hex editor.  Here's the contents of the ee13.hex file:

From the log file, the value of stack pointer low (SPL) is 0x9D, or 2 bytes less than the end of RAM (0x9F).  Considering the call to main() uses 2 bytes, it looks like the eelog function is working as expected.
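The circular buffer behaviour described in this post can be illustrated with a small PC-side simulation. This is not the code from eelog.h; it is a sketch that assumes an 8-bit write index wrapping from 255 back to 0, and it uses a plain array (fake_eeprom) in place of the real EEPROM. All names here are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LOG_SIZE 256u   /* the post says up to 256 bytes of EEPROM */

/* The wrap-around below relies on an 8-bit index covering the whole log. */
static_assert(LOG_SIZE == 256u, "8-bit index assumes a 256-byte log");

static uint8_t fake_eeprom[LOG_SIZE];   /* stand-in for the real EEPROM */
static uint8_t log_index;               /* wraps from 255 back to 0 */

/* Append one byte to the simulated log, overwriting the oldest entry
   once the buffer is full. */
static void eelog_sim(uint8_t value)
{
    fake_eeprom[log_index] = value;
    log_index++;
}
```

After 256 calls the index is back at 0, so the 257th byte overwrites the first entry in the log.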

Monday, June 1, 2015

picobootSTK500 v1 release

I've just released v1.0 of my Arduino-compatible picoboot bootloader.  It now includes support for EEPROM reads, and has been tested on an ATmega168p pro mini (the beta release was only tested on a 328p pro mini).  It also fixes a possible bug where the bootloader could hang while writing the non-read-while-write section of the flash.  Since it's been a few months since the beta release, which has been working well on a couple m328p modules, I've decided to bump the release to v1.

The bootloader only takes 224 bytes of flash space, so there's room left to add support for eeprom writes, and possibly auto-baud for the serial in the future.

Hex files for m168 and m328 are included in the github repo, and the Makefile includes a rule to use avrdude for flashing the bootloader.  If you are using the Arduino IDE with an ATmega328p, picoboot is drop-in compatible with the optiboot bootloader used on the Uno, so just select the Uno in the boards menu.  For the ATmega168, modify the boards.txt file to support the faster upload speed and extra flash space:

Using picoboot increases the unused flash space by 12.5% compared to the stock bootloader on the ATmega168 boards such as the 168p Pro Mini.
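The 12.5% figure can be reproduced with a little arithmetic. The sketch below rests on two assumptions that are not stated in this post: that the stock bootloader on ATmega168 boards reserves 2048 bytes of the 16384 bytes of flash, and that picoboot's 224 bytes sit in a 256-byte boot section. The function name is made up for the example.

```c
#include <assert.h>

/* Percentage increase in flash available to the application when the
   space reserved for the bootloader shrinks from old_boot_bytes to
   new_boot_bytes. */
static double extra_flash_percent(unsigned flash_bytes,
                                  unsigned old_boot_bytes,
                                  unsigned new_boot_bytes)
{
    assert(old_boot_bytes < flash_bytes && new_boot_bytes <= old_boot_bytes);
    unsigned old_free = flash_bytes - old_boot_bytes;
    unsigned new_free = flash_bytes - new_boot_bytes;
    return 100.0 * (new_free - old_free) / old_free;
}
```

Under those assumptions, application space grows from 14336 to 16128 bytes, and extra_flash_percent(16384, 2048, 256) comes out to 12.5.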

Saturday, May 23, 2015

nRF24l01 control with 2 MCU pins using time-division duplexed SPI

Doing more with pin-limited MCUs seems to be a popular challenge, as my post nrf24l01+ control with 3 ATtiny85 pins is by far the most popular on my blog.  A couple months ago I had an idea of how to multiplex the MOSI and MISO pins, and got around to working on it over the past couple weeks.  The result is that I was able to control a nRF24l01+ module using just two pins on an ATtiny13a.  I also simplified my design for multiplexing the SCK and CSN lines so it uses just a resistor and capacitor.  Here's the circuit:

Starting with the top of the circuit, MOMI represents the AVR pin used for both input and output.  The circuit is simply a direct connection to the slave's MOSI (data in) pin, and a resistor to the MISO.  Since this is not a standard SPI configuration, I've written some bit-bang SPI code that works with the multiplexing circuit.  To read the data, the MOMI pin is simply set to input.  Before bringing SCK high, MOMI is set to output and the pin is set high or low according to the current data bit.  The 4.7k resistor keeps the slave from shorting out the output from the AVR if the AVR outputs high while the slave outputs low, or vice versa.

Looking at the SCK/CSN multiplexing part of the circuit, I've removed the diode that was in the original version.  The purpose of the diode was to discharge the capacitor during the low portion of the SCK clock cycles, so the voltage on the CSN pin wouldn't move up in accordance with the typical 50% duty cycle of the SPI clock.  My bit-bang duplex SPI code is written so the clock duty cycle is less than 25%, keeping CSN from going high while data is being transmitted.  The values for C1 and R1 are not critical and are just based on what was within reach when I built the circuit; in fact I'd recommend lower values.  470 ohms * 0.22uF gives an RC time constant of 103us, meaning SCK needs to be held low for >103us for C1 to discharge enough for CSN to go low.  Something like a 220 ohm resistor and 0.1uF capacitor would reduce the delay required for CSN to go low to around 25us.
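The numbers above can be checked with a couple of one-line helpers. This is an idealized sketch: it treats R1/C1 as a simple low-pass filter, so the long-run average voltage on CSN is just the supply voltage times the SCK duty cycle, and it ignores the loading of the CSN input. The function names are made up for the example.

```c
#include <assert.h>

/* RC time constant in microseconds; ohms times microfarads gives
   microseconds directly. */
static double rc_time_constant_us(double r_ohms, double c_uf)
{
    assert(r_ohms > 0.0 && c_uf > 0.0);
    return r_ohms * c_uf;
}

/* Idealized average voltage on the capacitor when SCK toggles
   continuously with the given duty cycle (0.0 to 1.0). */
static double csn_average_v(double vcc, double duty)
{
    assert(duty >= 0.0 && duty <= 1.0);
    return vcc * duty;
}
```

rc_time_constant_us(470, 0.22) gives about 103us and rc_time_constant_us(220, 0.1) gives 22us.  With a duty cycle of 25%, the average on CSN stays at a quarter of the supply voltage, compared to half of it with a 50% SPI clock.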

The value of R2 is far more important.  The first value I tried was 1.5K, and after fixing a couple minor software bugs, it seemed to be working OK.  When I looked at the signals on my scope, I saw a problem:

The yellow trace shows the voltage level detected on the MOMI pin at the AVR.  Each successive high bit was a bit lower voltage, so after more than a few bytes of data, all the bits would likely be read as zero.  I suspect this has something to do with the internal capacitance of the output drivers on the nRF module, as well as its somewhat weak drive strength, documented in the datasheet at table 13.  A 4.7K resistor seems to be optimal, though anything from 3.3K to 6.8K should work.


Here is the AVR code for the time-division duplexed SPI:
uint8_t spi_byte(uint8_t dataout)
{
    // datain starts uninitialized; all 8 bits are shifted in below
    uint8_t datain, bits = 8;

    do {
        datain <<= 1;
        if(SPI_PIN & (1<<SPI_MOMI)) datain++;   // sample the slave's output

        sbi (SPI_DDR, SPI_MOMI);        // output mode
        if (dataout & 0x80) sbi (SPI_PORT, SPI_MOMI);
        SPI_PIN = (1<<SPI_SCK);         // toggle SCK
        cbi (SPI_DDR, SPI_MOMI);        // input mode
        SPI_PIN = (1<<SPI_SCK);         // toggle SCK

        cbi (SPI_PORT, SPI_MOMI);
        dataout <<= 1;
    } while (--bits);

    return datain;
}

I also wrote unidirectional spi_in and spi_out functions that work with the multiplexed MOSI/MISO.  Besides being faster than spi_byte, these functions work with the SE8R01 modules that have inconsistent drive strength on their MISO line.

The functions are in halfduplexspi.h, and I also wrote spitest.c, which will print the value of registers 15 through 0.  Here's a screen capture of the output from spitest.c: