Technical discussions about the TI C55x DSPs (including the c5501, c5502, c5503, c5507, c5509, c5510 and OMAP5910).
Hello,
I'm working on a 5509a DSP and trying to optimize a critical function as much as possible. Analyzing the code showed that most of the time is spent calling the std::abs() function in a for loop.
Is anyone aware of an optimized version of this function, or is there a fast algorithm to compute it?
According to the assembler user's guide, the 55x architecture seems to have a dedicated assembler instruction for computing the absolute value, but I don't know how to mix assembler and C/C++ instructions.
Thanks,
Marko
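On the "fast algorithm" part of the question, one common branch-free formulation is sketched below. The helper name abs_branchless is hypothetical and not part of any TI library, and whether it beats what the compiler already generates for std::abs() on the 55x is an assumption that would need to be measured.

```cpp
// Branch-free absolute value (hypothetical helper, for illustration only).
// Like std::abs(), the result is undefined for the most negative int,
// because its magnitude is not representable.
inline int abs_branchless(int x)
{
    // mask is 0 when x >= 0 and -1 (all bits set) when x < 0.
    const int mask = -(x < 0);
    // mask == 0:  (x ^ 0) - 0   == x
    // mask == -1: (x ^ -1) + 1  == ~x + 1 == -x
    return (x ^ mask) - mask;
}
```

It is used like any other function, e.g. sum += abs_branchless(samples[i]); inside the loop.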
Marko-
Calling asm from C/C++ code is easy and well documented in the TI C/C++ guide docs. But one thing you can try as an intermediate step is intrinsics, which allow C statements to be replaced by inline statements that expand into specific asm instructions. Try here:
http://focus.ti.com/lit/ug/spru281f/spru281f.pdf
Section 6.5.4, Using Intrinsics to Access Assembly Language Statements, shows some abs() intrinsics.
-Jeff
Hi Jeff,
I have already tried intrinsics, but _abs() doesn't work for me
("corelations.cpp", line 1035: error: identifier "_abs" is undefined).
Am I missing something?
In general I prefer not to stick with the assembler on the 55x due to
the silicon issues.
Thanks,
Marko
Marko-
> I have already tried intrinsics, but _abs() doesn't work for me
> ("corelations.cpp", line 1035: error: identifier "_abs" is undefined).
> Am I missing something?
As I recall, SPRU281f mentions _abss, _labss, etc. as the intrinsic names. The
trailing 's' means a saturated result.
> In general I prefer not to stick with the assembler on the 55x due to
> the silicon issues.
That doesn't make any sense. In the first place, it's rare that CCS allows entry of
the silicon version in the target setup, and in those cases where it does, both the
compiler and the assembler would use that information to avoid the relevant silicon
errata. Whether the compiler generates the asm instruction or you write it yourself
isn't going to matter.
-Jeff
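A minimal sketch of how the _abss() intrinsic mentioned above might be wrapped so the same source also builds on a host compiler. It assumes the C55x compiler predefines __TMS320C55X__ and accepts _abss() with an int argument, as recalled from SPRU281; both should be checked against the compiler guide for the toolchain version in use. The names abs_sat16 and sum_abs are hypothetical.

```cpp
// Saturated 16-bit absolute value.
#if defined(__TMS320C55X__)
// Assumption: building with the TI C55x compiler, where _abss() is an
// intrinsic that expands to a saturating absolute-value instruction.
inline int abs_sat16(int x) { return _abss(x); }
#else
// Portable stand-in for other compilers: models a 16-bit saturated result.
inline int abs_sat16(int x)
{
    if (x == -32768) return 32767;   // magnitude not representable in 16 bits
    return x < 0 ? -x : x;
}
#endif

// Example loop of the kind described in the thread: sum of absolute values.
inline long sum_abs(const short* data, int n)
{
    long acc = 0;
    for (int i = 0; i < n; ++i)
        acc += abs_sat16(data[i]);
    return acc;
}
```

Keeping the intrinsic behind one small inline wrapper means the loop itself does not change if the intrinsic name turns out to differ between tool versions.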
Hi,
The _abs() function is listed in the CCS Help menu for the 55x architecture.
marko
Marko-
> The _abs() function is listed in the CCS Help menu for the 55x architecture.
Yes I see that also... I would just say that if CCS Help and one of the SPRUs are in
conflict, go with the SPRU.
Did you try _abss()? Does it work? It shouldn't matter that your result is
saturated.
-Jeff
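To illustrate why saturation should rarely matter here, the sketch below models a plain and a saturated 16-bit absolute value in a wider type and counts the inputs on which they disagree. The function names are hypothetical, and the model assumes 16-bit two's-complement data; it is not the TI intrinsic itself.

```cpp
// Plain absolute value, computed in long so that 32768 is representable.
inline long abs_plain_model(long x) { return x < 0 ? -x : x; }

// Saturated variant: clamp the magnitude to the largest 16-bit value.
inline long abs_sat_model(long x)
{
    const long a = abs_plain_model(x);
    return a > 32767 ? 32767 : a;
}

// Count the 16-bit inputs for which the two versions give different results.
inline int count_differences()
{
    int count = 0;
    for (long x = -32768; x <= 32767; ++x)
        if (abs_plain_model(x) != abs_sat_model(x))
            ++count;
    return count;
}
```

Under these assumptions the two versions differ only for the input -32768, where the plain magnitude 32768 does not fit in 16 bits and the saturated version returns 32767.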
I'll try _abss() in the next few days and post the results.
Thanks,
marko