How to Remove All Line Breaks from Text Using Regex

If you are generating <meta> description tags automatically (e.g. by including all headings of the document), chances are that you’re extracting the text from various sources of content that contain different HTML elements and line breaks.

Here is a simple regular expression to remove all line breaks, carriage returns and tabs, and replace them with a single space.

$text = preg_replace( '/(\r\n)+|\r+|\n+|\t+/', ' ', $text );

It works by replacing each run of Windows (\r\n) or Unix (\n) line breaks, stray carriage returns (\r) and tabs (\t) with a single space character.
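To make the behaviour easier to try out, here is a minimal sketch that wraps the same pattern in a helper function. The function name collapse_line_breaks and its $replacement parameter are made up for illustration and are not part of the original snippet; the sample string is invented too.

```php
<?php
// Hypothetical helper (not from the original post): replaces runs of
// line breaks and tabs with $replacement, a single space by default.
function collapse_line_breaks( string $text, string $replacement = ' ' ): string {
    // Same pattern as above: runs of \r\n, \r, \n or \t.
    // preg_replace() returns null on error, so fall back to the input.
    return preg_replace( '/(\r\n)+|\r+|\n+|\t+/', $replacement, $text ) ?? $text;
}

// Invented sample with Windows line breaks, a Unix line break and a tab.
$sample = "Heading one\r\n\r\nHeading two\n\tHeading three";

echo collapse_line_breaks( $sample ), PHP_EOL;
// prints "Heading one Heading two  Heading three"

echo collapse_line_breaks( "first\nsecond", ',' ), PHP_EOL;
// prints "first,second"
```

Note the two spaces before "Heading three": the line break and the tab are matched as two separate runs, so each gets its own replacement. Passing a different second argument, such as a comma, swaps the replacement without touching the pattern.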

9 Comments

  1. Shan says:

    Hi Kaspars,
    This really works good and saved a lot of time for me….!
    Thanks a lot….! :)

  2. Larry says:

    Sweet. Thanks man, saved me a lot of time!

  3. Zamicol says:

    Shouldn’t it be (\r\n)+|\r+|\n+?

    “+” will only apply to the immediately preceding character, so “\r\n” needs to be grouped. I believe “\r\n+” matches a single “\r” followed by one or more “\n”, rather than one or more “\r\n” pairs.

    • Kaspars says:

      Thanks Zamicol, I think you’re right — I have updated the post.

    • Rob says:

      \r is carriage return (moves the cursor horizontally back to the left of the page)
      \n is new line (moves the cursor down one line)

      They’re anachronisms from the typewriter age: you could press ‘enter’ and drop the cursor down to the next line on the page (actually it raised the paper instead, but same result)

      However you had to, in many models, manually grab the carriage (the bit that moved across your page as you typed) and push it back over to the left. On many models there was a bell that chimed when you were approaching the right margin of your page, signaling that you should end your word, press enter (\n) and then return the carriage to the starting position on the left margin (\r).

      It would appear that some people did this in a different order, as the classic ordering of \r\n (or 13 followed by 10, in ASCII), is carriage return first, followed by the newline.

      I’d be interested to hear if anyone knows of why it’s \r\n (13,10) instead of \n\r (10,13).

  4. Camille says:

    Cool, thanks for this! :)

    How can I make it replace line breaks with a comma instead of an empty space?

  5. Roland says:

    The /i modifier for case insensitive matching is a bit silly here. There are no upper or lower case line breaks.
