I am now working with the LPC1114, which uses the ARM Cortex-M0 architecture. I have a question about the instruction set summary of the ARMv6-M Thumb instruction set: I want to know what the processor does during each clock cycle of each instruction. For example, take the following code, which writes to the GPIO0DATA register to change the level of an I/O output (it toggles PIO0_3):
loop
        LDR   R0, =(0x50003FFC)  ; GPIO0DATA base + 0x3FFC, address 0x5000 3FFC
        LDR   R1, [R0]           ; read the current value of GPIO0DATA
        MOVS  R2, #(1<<3)        ; bit mask for PIO0_3
        EORS  R1, R1, R2         ; toggle bit 3
        STR   R1, [R0]           ; store the value of R1 into GPIO0DATA
        B     loop
Question 1: 
Let's say the first clock cycle is when the chip fetches the LDR R0, =(0x50003FFC) instruction. What does the chip do in the following clock cycles? If there is a reference that explains this, that would be really helpful.
Question 2: 
I find that PIO0_3 toggles every 15 cycles. However, based on the instruction set summary, it should be 11 cycles (LDR and STR take 2 cycles each, MOVS and EORS take 1 cycle each, and B takes 3 cycles). Does anyone know why? A timing diagram that explains it would be great!
 
					
				
		
xiangjun_rong
Hi, Liao,
I suggest modifying your code as follows to save cycles:
        LDR   R0, =(0x50003FFC)  ; GPIO0DATA base + 0x3FFC, address 0x5000 3FFC
loop
        LDR   R1, [R0]           ; read the current value of GPIO0DATA
        MOVS  R2, #(1<<3)        ; bit mask for PIO0_3
        EORS  R1, R1, R2         ; toggle bit 3
        STR   R1, [R0]           ; store the value of R1 into GPIO0DATA
        B     loop
Because R0 does not change inside the loop, you can move the loop label down one line so that the LDR R0 instruction executes only once, which saves cycles on every iteration.
Question 1: 
Let's say the first clock cycle is when the chip fetches the LDR R0, =(0x50003FFC) instruction. What does the chip do in the following clock cycles? If there is a reference that explains this, that would be really helpful.
>>>>>>The line LDR R0, =(0x50003FFC) is a macro, it is replaced with two movs instruction. As you know, a Cortex-M0 instruction is either a word (32 bits) or a half word (16 bits); because the operand 0x50003FFC is 32 bits, it is replaced by two move instructions.
Question 2: 
I find that PIO0_3 toggles every 15 cycles. However, based on the instruction set summary, it should be 11 cycles (LDR and STR take 2 cycles each, MOVS and EORS take 1 cycle each, and B takes 3 cycles). Does anyone know why? A timing diagram that explains it would be great!
>>>>>I do not know where you got the cycle count of each instruction. In any case, because of the pipeline, the branch instruction B may take multiple cycles. The Cortex-M0 is ARM IP, so I suggest asking ARM directly; we do not have detailed documentation about the core.
Hope it can help you
BR
Xiangjun Rong
Hi, Rong,
You mentioned that "The line LDR R0, =(0x50003FFC) is a macro, it is replaced with two movs instruction". Could you please explain what the two movs instructions are?
Thanks,
Haohao.
On Wed, Nov 28, 2018 at 1:03 AM, xiangjun.rong <admin@community.nxp.com> wrote:
Re: clock cycle activity for LPC1114
 
					
				
		
xiangjun_rong
Hi, Liao,
Anyway, LDR Rd, =const is a pseudo-instruction, and different assemblers may handle it in different ways.
This is the relevant part I found in DUI0801E_armasm_user_guide.pdf:
Hope it can help you
BR
XiangJun Rong
6.7 Load immediate values using LDR Rd, =const
The LDR Rd,=const pseudo-instruction generates the most efficient single instruction to load any 32-bit number.
You can use this pseudo-instruction to generate constants that are out of range of the MOV and MVN instructions.
The LDR pseudo-instruction generates the most efficient single instruction for the specified immediate value:
• If the immediate value can be constructed with a single MOV or MVN instruction, the assembler generates the appropriate instruction.
• If the immediate value cannot be constructed with a single MOV or MVN instruction, the assembler:
  - Places the value in a literal pool (a portion of memory embedded in the code to hold constant values).
  - Generates an LDR instruction with a PC-relative address that reads the constant from the literal pool.
    For example:
        LDR rn, [pc, #offset to literal pool]  ; load register n with one word
                                               ; from the address [pc + offset]
You must ensure that there is a literal pool within range of the LDR instruction generated by the assembler.
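To illustrate the PC-relative load described in the excerpt, here is a minimal Python sketch of how the literal address is formed for the 16-bit Thumb LDR (literal) encoding, as I understand it from the ARMv6-M architecture: the PC reads as the instruction address plus 4, is rounded down to a word boundary, and the offset (a multiple of 4, up to 1020) is added. The function name `literal_address` and the example addresses are made up for illustration; they are not taken from the code in this thread.

```python
def literal_address(instr_addr: int, offset: int) -> int:
    """Address read by a 16-bit Thumb `LDR Rt, [PC, #offset]`.

    Assumptions (not from the thread): PC reads as the instruction
    address + 4, and the offset is a multiple of 4 in 0..1020.
    """
    if offset % 4 != 0 or not 0 <= offset <= 1020:
        raise ValueError("offset must be a multiple of 4 in 0..1020")
    pc = instr_addr + 4           # PC value seen by the instruction
    return (pc & ~0x3) + offset   # Align(PC, 4) + offset


# Hypothetical instruction addresses, for illustration only.
print(hex(literal_address(0x100, 0x10)))  # word-aligned instruction
print(hex(literal_address(0x102, 0x10)))  # halfword-aligned instruction
```

Both calls resolve to the same word because the PC is aligned down before the offset is added. This is also why the literal pool has to sit within roughly 1 KB after the LDR instruction.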
Hi, Rong,
Many thanks for your help! I really appreciate it!
On Thu, Nov 29, 2018 at 2:35 AM, xiangjun.rong <admin@community.nxp.com> wrote:
One reason for code executing more slowly than you expect is that the code is executing out of Flash, which can introduce wait-states.
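To put numbers on this, here is a rough Python sketch of the arithmetic in the thread. The per-instruction cycle counts are the ones quoted in the question. The thread does not say how many wait states are configured or which accesses they affect, so the sketch only reports the gap between the nominal and measured counts without attributing it to specific instructions. `CYCLES` and `loop_cycles` are names made up for this illustration.

```python
# Nominal cycle counts as quoted in the question (instruction set summary).
CYCLES = {"LDR": 2, "STR": 2, "MOVS": 1, "EORS": 1, "B": 3}


def loop_cycles(mnemonics):
    """Sum the nominal cycle counts for one pass through the loop."""
    return sum(CYCLES[m] for m in mnemonics)


# Original loop: the LDR R0, =... literal load is inside the loop.
original = ["LDR", "LDR", "MOVS", "EORS", "STR", "B"]
# Suggested loop: the literal load is moved above the loop label.
hoisted = original[1:]

measured = 15  # cycles per toggle reported in the question
nominal = loop_cycles(original)

print("nominal cycles, original loop:", nominal)
print("unexplained cycles:", measured - nominal)
print("nominal cycles, hoisted loop:", loop_cycles(hoisted))
```

The nominal sum matches the 11 cycles computed in the question, leaving 4 cycles per iteration that the instruction table alone does not account for. Moving the literal load out of the loop brings the nominal count down to 9, before any wait states.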
