Hi, Liao,
I think you can modify your code as follows to save instruction cycles:
        LDR  R0, =(0x50003FFC)  ; GPIO0DATA base + 0x3FFC, address 0x5000 3FFC
loop:   LDR  R1, [R0]           ; Read the current value of GPIO0DATA into R1
        MOVS R2, #(1<<3)        ; Bit mask for PIO0_3
        EORS R1, R1, R2         ; Toggle bit 3
        STR  R1, [R0]           ; Store the value of R1 into GPIO0DATA
        B    loop
Because R0 does not change inside the loop, you can move the loop label down one line, onto the LDR R1, [R0] instruction, so that the address load is not repeated on every iteration.
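As a rough check of the saving, here is a minimal C sketch that adds up the per-instruction cycle counts quoted in Question 2 below (LDR/STR 2, MOVS/EORS 1, taken B 3). The names loop_cycles, original_loop and modified_loop are made up for this illustration, and the counts are the asker's figures, which assume no wait states; this is not a measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Cycle counts as quoted in Question 2 (assumed: zero wait states). */
enum { CYC_LDR = 2, CYC_STR = 2, CYC_MOVS = 1, CYC_EORS = 1, CYC_B_TAKEN = 3 };

/* Sum the cycle counts of the instructions executed in one loop iteration. */
static unsigned loop_cycles(const unsigned *costs, size_t n)
{
    unsigned total = 0;
    assert(costs != NULL);
    for (size_t i = 0; i < n; ++i) {
        total += costs[i];
    }
    return total;
}

/* Original loop: label on the LDR R0, =... line, so the address is reloaded. */
static const unsigned original_loop[] = {
    CYC_LDR, CYC_LDR, CYC_MOVS, CYC_EORS, CYC_STR, CYC_B_TAKEN
};

/* Modified loop: label on LDR R1, [R0], so the address load runs only once. */
static const unsigned modified_loop[] = {
    CYC_LDR, CYC_MOVS, CYC_EORS, CYC_STR, CYC_B_TAKEN
};
```

Calling loop_cycles(original_loop, 6) gives 11 and loop_cycles(modified_loop, 5) gives 9, so under these assumed counts moving the label saves the 2 cycles of the address load on each iteration.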
Question 1:
Let's say the first clock cycle is when the chip fetches the LDR R0, =(0x50003FFC) instruction. What does the chip do in the following clock cycles? Also, if there is any reference that explains this, that would be really helpful.
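>>>>>>The line LDR R0, =(0x50003FFC) is a pseudo-instruction, not a real instruction. The Cortex-M0 (ARMv6-M) uses the Thumb instruction set, in which almost all instructions are 16 bits (half word) wide, so a 32-bit operand such as 0x50003FFC cannot be encoded inside the instruction itself. The assembler therefore stores the constant in a literal pool near the code and replaces the line with a PC-relative load, LDR R0, [PC, #offset], which reads the constant from that pool.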
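When the assembler places the constant in a literal pool, the 16-bit Thumb LDR (literal) encoding finds it at Align(PC, 4) + imm8 * 4, where PC reads as the address of the LDR instruction plus 4. The C sketch below only models that address calculation; the function name literal_address and the example addresses are hypothetical and were not taken from the original program.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the address used by the 16-bit Thumb "LDR Rt, [PC, #imm8*4]".
 * instr_addr: address of the LDR instruction itself (hypothetical value).
 * imm8:       8-bit immediate field of the encoding, in units of words.
 */
static uint32_t literal_address(uint32_t instr_addr, uint32_t imm8)
{
    assert(imm8 <= 0xFFu);                    /* the field is only 8 bits wide */
    uint32_t pc   = instr_addr + 4u;          /* PC reads as instruction + 4   */
    uint32_t base = pc & ~(uint32_t)3u;       /* align down to a word boundary */
    return base + (imm8 << 2);                /* offset is imm8 words          */
}
```

For example, an LDR at the hypothetical address 0x100 with imm8 = 2 would read its constant from 0x10C, and one at 0x102 with imm8 = 0 would read it from 0x104.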
Question 2:
I find that PIO0_3 toggles every 15 cycles. However, based on the instruction set summary, it should be 11 cycles (LDR/STR take 2 cycles each, MOVS and EORS take 1 cycle each, and B takes 3 cycles). Does anyone know why? If there is a timing diagram that explains it, that would be great!
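>>>>>I do not know where you got the cycle count of each instruction. In any case, such figures normally assume zero wait states, so the measured time can be longer: because of the pipeline, the branch instruction B occupies multiple cycles, and memory or peripheral accesses may add further cycles. The Cortex-M0 core is ARM IP, so I suggest you ask ARM directly; we do not have detailed documentation of the core.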
Hope this helps.
BR
Xiangjun Rong