How to stop non-UTF-8 characters from breaking your WordPress feeds

If non-UTF-8 characters are breaking your WordPress feeds (i.e. breaking XML parsers), one solution is to attach a filter to the the_excerpt_rss hook, which is applied by the_excerpt_rss(), and strip or convert the offending characters. I’m guessing the errant off-character characters (ahem) are the result of promiscuous copypasting.

  1. Grab Jason Judge’s self-contained function for limiting to valid UTF-8 characters (here’s a link to the source).
  2. Paste it into your theme’s functions.php file.
  3. Also add the following lines:
    function the_excerpt_rss_utf8($text) {
        return trim(clean_utf8_xml_string($text));
    }
    add_filter('the_excerpt_rss', 'the_excerpt_rss_utf8');
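
For a rough idea of what a cleaning function of this kind might do, here is a minimal sketch. It is not Jason Judge’s function: the name example_strip_invalid_xml_chars is made up for illustration, and it assumes the mbstring extension is available. It drops bytes that are not valid UTF-8, then drops code points that XML 1.0 does not allow.

```php
<?php
// Hypothetical helper for illustration only; NOT the clean_utf8_xml_string()
// function referenced in the steps above. Assumes the mbstring extension.
function example_strip_invalid_xml_chars($text) {
    // Remember the current substitute character so the global setting
    // can be restored afterwards.
    $previous = mb_substitute_character();

    // 'none' tells mbstring to drop invalid byte sequences instead of
    // replacing them with '?'.
    mb_substitute_character('none');
    $utf8 = mb_convert_encoding($text, 'UTF-8', 'UTF-8');
    mb_substitute_character($previous);

    // Keep only code points permitted by the XML 1.0 Char production:
    // tab, LF, CR, and the ranges U+0020-U+D7FF, U+E000-U+FFFD,
    // U+10000-U+10FFFF.
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    $clean = preg_replace($pattern, '', $utf8);

    // preg_replace() returns null on failure; fall back to an empty string.
    return $clean === null ? '' : $clean;
}
```

If you wanted to try it, it could be hooked up the same way as in step 3, by calling it from the function you pass to add_filter(). Stripping is the blunt option; converting from a known source encoding would preserve more text, but you need to know what that encoding is.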